Blog

A common and very challenging problem in machine learning is overfitting, which comes in many forms. It is one of the major concerns when training a model. Overfitting occurs when the model captures too much noise in the training data set, which leads to poor prediction accuracy when the model is applied to new data. One way to avoid overfitting is regularization. In this tutorial, we...
Today, data science specialists are among the most sought-after professionals in the labor market. Able to find significant insights in huge amounts of information, they help companies and organizations optimize how they work. The field of data science is developing rapidly and the demand for talent is changing. We use job offerings to analyze the current demand for talent. After performing a first analysis in 2017, this...
Over the past few years, tasks involving text and speech processing have become very popular. Among the research areas within Natural Language Processing and Machine Learning, sentiment analysis ranks high. Sentiment analysis makes it possible to identify and extract subjective information from source data using data analysis and visualization, ML classification models, and text mining. This helps...
Introduction: PostgreSQL is probably one of the most powerful open-source relational databases. Its feature set is no worse than Oracle’s and well ahead of MySQL’s. If you build apps in Python, sooner or later you will need to work with databases. Luckily, Python has a wide range of packages that make it easy to connect to and use databases. In...
What other people think about a product or service has a big impact on our everyday decisions. Previously, people relied on the opinions of friends and relatives or on product and service reports, but the Internet era has changed this significantly. Today, opinions are collected from people around the world via e-commerce review sites as well as blogs and social networks. To transform gathered...
Introduction: Exploratory data analysis (EDA) is an approach to data analysis that summarizes the main characteristics of the data. It can be performed using various methods, among which data visualization plays a major role. The idea of EDA is to find out what the data can tell us beyond the formal modeling or hypothesis testing task. In other words, if initially we have no or too few a priori ideas about...
In the modern world, the flow of information a person faces is daunting. This has led to a rather abrupt change in the basic principles of data perception, and visualization is therefore becoming the main tool for presenting information. With visualization, information is presented to the audience in a clearer, more accessible form. A properly chosen visualization method makes it possible to structure large data arrays,...
The more carefully you process the data and go into details, the more valuable information you can get for your benefit. Data visualization is an efficient and handy tool for gaining insights from data. Moreover, you can make the data far more understandable, colorful and pleasant with the help of visualization tools. As data is changing every second, it is an urgent task to investigate it carefully and get the insights as fast as...
What is Exploratory Data Analysis? Exploratory data analysis (EDA) is a powerful tool for a comprehensive study of the available information, providing answers to basic data analysis questions. What distinguishes it from traditional analysis based on testing a priori hypotheses is that EDA makes it possible to detect, using various methods, all potential systematic correlations in the...
Since the dawn of the digital age, the amount of data stored on servers has risen dramatically. More and more firms are looking for talent that can handle their datasets and generate insights for business decisions. Data scientists are among the most sought-after for this task. Google Trends shows that the global volume of the search term “Data Scientist” has tripled over the last 5 years, but how does the increasing demand translate...
Since the dawn of the digital age, the amount of data stored on servers has risen dramatically. With this increase, more and more firms are looking for talent that can handle their datasets and generate insights for business decisions. Google Trends shows that the global volume of the search term “Data Analyst” nearly tripled over the last 5 years. How does the increasing demand translate into earnings of data analysts in...
Companies use machine learning to improve their business decisions. Algorithms select ads, predict consumers’ interest or optimize the use of storage. However, few stories of machine learning applications for public policy are out there, even though public employees often make comparable decisions. Similar to the business examples, decisions by public employees often try to optimize the use of limited resources. Algorithms may assist...