As a Data Scientist, I have worked on a wide range of projects in different fields.
I work primarily in Natural Language Processing (NLP) and in the use of text data for building artificial intelligence systems.
Selected problems I have solved using machine learning:
- assessing the comprehensibility of Polish legal acts,
- prediction of insurance fraud,
- prediction of transactional fraud,
- exploration of factors affecting the success of advertising campaigns.
In my analyses I use:
- basic tools of descriptive statistics,
- tools of inferential statistics: parametric and non-parametric tests,
- regression models: linear, logistic and regularized models,
- single models, e.g. decision trees and support vector machines,
- ensemble learning models: random forests and various boosting methods,
- unsupervised learning,
- neural networks,
- natural language processing (NLP).
For research work I use:
- Python (pandas, NumPy, Matplotlib, scikit-learn, TensorFlow, Keras, Selenium, spaCy, NLTK),
- R (tidyverse: dplyr, tidyr, ggplot2; tidymodels; tidytext),
- SQL,
- git for version control,
- Markdown for writing.
I am also familiar with, though I do not use them regularly:
- dashboard software,
- basic HTML and CSS.