Ever since the first forms of writing emerged, people have needed to collect information, and that information has accumulated over the years, becoming ever more abundant. Today, the growth of the technology sector has also caused a disproportionate increase in the volume of data.
As a result, more sophisticated and complex data storage is required. With the boom in information technologies, organizations have had to face new challenges in analyzing, discovering and understanding the information these systems hold. That is where big data comes in.
In this sense, big data is the computerized processing of large amounts of data (structured, unstructured and semi-structured) that exceed the capacity of conventional software to capture, manage and process them within a reasonable time.
So, we speak of big data when there is a large volume of information of many different kinds. The system must also respond as quickly as possible, in order to deliver the right information at the right time. These are, precisely, the three main characteristics of a big data project: volume, variety and velocity.
Because there is a wide variety of data types to analyze, the data must be classified into categories, such as Web and social media content, Machine-to-Machine (M2M) data, big transaction data, biometric data or human-generated data.
It is worth pointing out that cities are full of data streams. This is where the concept of the smart city appears: a city in which the innovative use of data helps to provide better and more inventive services that improve people’s lives. It is estimated that about 3 billion people currently live in cities.
According to the United Nations, 70% of the population is expected to reside in cities by 2050. People will therefore demand more and more of these cities. As big data projects, smart cities need to capture, store, process and analyze large amounts of data from many different sources and transform them into useful knowledge.
The extraction of data: Data mining
Once the necessary data has been stored, it is important to consider different data analysis techniques, such as association, clustering, text analytics or data mining. Data mining is one of the most important, because it is the process of extracting data, analyzing it from many dimensions or perspectives and summarizing the information in a useful form.
Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. There are two types of data mining: descriptive, which gives information about existing data; and predictive, which makes forecasts based on the data. To this end, data mining uses statistics and, in some cases, artificial intelligence and neural network algorithms.
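To make the difference between descriptive and predictive mining concrete, here is a minimal sketch in Python. The two fields (`ad_spend` and `sales`) and their values are invented for illustration and are not taken from any real dataset. The descriptive step measures how strongly the two fields are correlated. The predictive step fits a straight line by least squares and uses it to forecast a value that has not been observed yet.

```python
from statistics import mean


def pearson(xs, ys):
    """Descriptive: Pearson correlation between two numeric fields."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Predictive: least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical, perfectly linear sample data (sales = 2 * ad_spend + 5).
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

correlation = pearson(ad_spend, sales)        # describes the existing data
slope, intercept = fit_line(ad_spend, sales)  # model learned from the data
forecast = slope * 60 + intercept             # forecast for an unseen spend of 60

print(f"correlation={correlation:.2f} forecast={forecast:.1f}")
```

Real data is rarely this tidy, so the correlation would usually fall below 1 and the forecast would carry some error. A production system would also use more fields and more robust models.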
Basically, data mining arises to help us understand the content of big data. It mainly consists of the following steps:
- Extract, transform, and load transaction data onto the data warehouse system (a system for centralized data management and retrieval).
- Store and manage the data in a multidimensional database system.
- Present the data in a useful format (in a graph or a table).
- Analyze the data with application software.
- Finally, provide data access to business analysts and information technology professionals.
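The steps above can be sketched end to end with Python's standard library. This is only an illustration under several assumptions: the CSV content, the `sales` table and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse or multidimensional database system.

```python
import csv
import io
import sqlite3

# Hypothetical transaction data; the last row has an unusable amount.
RAW_CSV = """region,product,amount
north,widget,120.50
north,gadget,80.00
south,widget,200.00
south,widget,not_available
"""


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize text fields and drop rows with a non-numeric amount."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        clean.append((row["region"].strip().lower(),
                      row["product"].strip().lower(),
                      amount))
    return clean


def load(conn, rows):
    """Load: store the cleaned rows in the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (region TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def summarize(conn):
    """Analyze: total amount per region and product."""
    cursor = conn.execute(
        "SELECT region, product, SUM(amount) FROM sales "
        "GROUP BY region, product ORDER BY region, product"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
summary = summarize(conn)

# Present: print the result as a small table.
for region, product, total in summary:
    print(f"{region:<8}{product:<8}{total:>10.2f}")
```

Grouping by region and product is a simple relational approximation of looking at the data along two dimensions. A dedicated OLAP or warehouse product would offer this kind of aggregation at much larger scale.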
Although data mining is a relatively new concept, the technology is not. Companies have used powerful computers to analyze market research reports for years. However, continuous innovations are dramatically increasing the accuracy of analysis while driving down the cost.
Data mining is presented as an emerging technology with several advantages: it can be a good meeting point between researchers and business people, it can save a company large amounts of money, and it can open up new business opportunities. That is why data mining is so important. At this point, the next question is: how is data mining used to generate Business Intelligence?
Nowadays, data mining is primarily used by companies with a strong consumer focus. Business applications rely on data mining software solutions; as a result, data mining tools are today an integral part of enterprise decision-making and risk management. Acquiring information through data mining in this way is referred to as Business Intelligence (BI).
How data mining is used to generate Business Intelligence
Being able to use the information you gather is at least as important as gathering it, which is why Business Intelligence (BI) matters. Business Intelligence is the ability to transform data into information and information into knowledge, so as to optimize the decision-making process in business.
In this sense, Business Intelligence is a set of methodologies, applications and technologies for collecting, refining and transforming data from transactional systems and unstructured information (internal and external to the company) into structured information for direct exploitation or for analysis.
Business Intelligence combines data analysis applications, including ad hoc analysis and querying, enterprise reporting, online analytical processing (OLAP), mobile BI, real-time BI, operational BI, cloud and software as a service BI, open source BI, collaborative BI and location intelligence.
BI technology also includes data visualization and tools for building BI dashboards, performance scorecards and key performance indicators. In addition, these tools generate findings that are ultimately used to gain a competitive advantage over rivals, run better and more efficient business operations, and improve survivability and risk management.
Data mining tools also provide better customer relationship management by mining real habits and diverse patterns. In summary, a Business Intelligence strategy should be used to apply this knowledge to maximize the benefits for the company.
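As one illustration of mining customer habits, the sketch below applies a very small version of association analysis to shopping baskets. The baskets, the `pair_rules` function and the 0.5 support threshold are all assumptions made for this example. For every pair of items bought together often enough, it reports the support (the share of baskets containing both items) and the confidence (how often the second item appears in baskets that contain the first).

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, one set of items per customer visit.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    total = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (first, second), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue  # pair is not bought together often enough
        rules.append((first, second, support, count / item_counts[first]))
        rules.append((second, first, support, count / item_counts[second]))
    return sorted(rules)


rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f} confidence={confidence:.2f}")
```

In this toy data every basket that contains butter also contains bread, so the rule "butter -> bread" has a confidence of 1.0. A CRM team could use a pattern like that to decide which products to recommend together.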
Thus, Business Intelligence acts as a strategic factor for a business, providing privileged information to respond to business problems: entering new markets, financial control, cost optimization, production planning, analysis of customer profiles, profitability and so on. That is how data mining is used to generate Business Intelligence.
For example, the potential benefits of Business Intelligence programs include accelerating and improving decision making; optimizing internal business processes; increasing operational efficiency; driving new revenues; and gaining competitive advantages over business rivals. BI systems can also help companies identify market trends and spot business problems that need to be addressed.
In summary, Business Intelligence (BI) is an increasingly popular term for the tools and systems that play a key role in a corporation's strategic planning process by turning knowledge into profit.
Data mining and Business Intelligence have made it possible for various industries, such as sales and marketing, healthcare organizations or financial institutions, to analyze data quickly and thereby improve the quality of their decision-making.
In addition, data mining technologies have a bright future in business applications, opening up new opportunities through the automated prediction of trends and behaviours in these businesses. How data mining is used to generate Business Intelligence is therefore a topic that we will hear a lot about in the coming years: it is the future.