Data Science

As a Data Scientist, I have worked on a wide range of projects in different fields.

I work primarily in Natural Language Processing (NLP) and in the use of text data for building artificial intelligence systems.

Selected problems I have solved using machine learning

  • assessing the comprehensibility of Polish legal acts,
  • predicting insurance fraud,
  • predicting transactional fraud,
  • exploring factors that affect the success of advertising campaigns.

In my analyses I use

  • basic tools of descriptive statistics,
  • tools of inferential statistics: parametric and non-parametric tests,
  • regression models: linear, logistic and regularized models,
  • single (non-ensemble) models, e.g. decision trees and support vector machines,
  • ensemble learning models: random forests and various boosting methods,
  • unsupervised learning,
  • neural networks,
  • natural language processing (NLP).

For research work I use

  • Python (pandas, NumPy, Matplotlib, scikit-learn, TensorFlow, Keras, Selenium, spaCy, NLTK),
  • R (tidyverse: dplyr, tidyr, ggplot2; tidymodels, tidytext),
  • SQL,
  • Excel,
  • Git for version control,
  • Markdown for writing.

I am also familiar with, though I do not regularly use

  • SAS,
  • dashboard software,
  • basic HTML and CSS.