- February 12, 2020 at 12:30 am, #85673, Participant @idowu
Most successful companies in the world today rely on data to make business decisions and drive their way to the top of the market. Data has become extremely valuable in today’s business ecosystem – leading to the coining of the phrase “data is the new oil”.
As competition increases, older and bigger companies keep devising new strategies to hold on to their market share. These strategies often involve acquiring and interpreting data from their large user bases. For startups to survive in today’s data-centric world, they too must collect data and generate market insights before they begin business operations.
What is Data Acquisition
Data acquisition is any action aimed at collecting and collating data targeted at specific needs from different sources – such data could be primary (from the original source) or secondary (from already collected data) in origin.
However, although both start-ups and established companies need to harness data, one major challenge remains: how to plan data collection and gather enough data.
The Need for Data Collection
Data collection is usually needed for one or more of the following reasons:
- Market intelligence – to draw market insights and make marketing decisions.
- Production decisions – to generate product insights, assess consumers’ tastes and wants, optimize product design and develop new products.
- Research purposes – which could be academic, governmental or non-governmental.
- Competitor information – start-ups generally lack access to enough data, so one way to stay competitive is to gather data on how existing competitors’ products perform.
- Individual use – we give out raw data and consume refined data every day whenever we use application software that interacts with servers.
Data Acquisition Strategies
Data collection can be done in one of two ways, or through a combination of both:
- Web-based data collection; and
- Manual data collection
Each of these collection methods can be carried out in various ways, which we’ll look at in turn.
Web-Based Data Collection
Web-based data collection automates the process of collecting data. It draws on the wide range of technologies available for generating data, and usually involves one or more strategies for getting data over the internet.
With the advent of the internet, the world has become socially connected, allowing people from different regions to interact. People, countries and continents communicate over the web daily, and these conversations are stored in databases that are either accessible to the public or classified (not available for public use because they contain sensitive information).
Examples of public data include those held in government databases, which are often available at no cost. Websites like data.world hold a catalogue of data sets for different purposes, each with its own source, and the World Bank also publishes data that suits different purposes. Many other websites give access to public data, usually in organized formats such as CSV or Excel sheets.
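Once a public data set has been downloaded as CSV, reading it takes only a few lines. The sketch below is a minimal illustration using Python's standard `csv` module. The sample text, its column names and its values are invented placeholders standing in for a real downloaded file, and `load_rows` is a hypothetical helper name.

```python
import csv
import io

# Invented sample standing in for a downloaded public CSV file;
# the column names and values are placeholders, not real data.
SAMPLE_CSV = """item,category,price
rice,food,2.5
juice,beverage,1.2
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = load_rows(SAMPLE_CSV)
for row in rows:
    # Every value is read as a string; convert types as needed.
    print(row["item"], row["category"], float(row["price"]))
```

For a file on disk, the same `csv.DictReader` can be given a file object opened with `open(path, newline="")` instead of the in-memory string.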
Good examples of classified or private information include investigative data, some medical records, security data and financial statements, which are usually managed by specialized or private agencies, e.g. the FBI. Unlike public data, they are not made available to the public.
There are several ways to acquire data over the internet, including:
- Through search engines – this is a shotgun approach. It’s the most conventional method and involves critically and analytically searching for websites that offer public data. You can use any search engine, such as Google or Bing, so long as it points you to where you can get relevant data. For example, searching Google for “food and beverage data sets” or “food and beverage csv” returns a list of websites that host food-related data sets.
- By using online data acquisition platforms – several online data acquisition platforms are available, most of them paid. They work by paying each respondent (a person who fills in a survey) a certain amount of money for completing a particular form. They select platform users based on certain criteria and invite them to participate in a particular survey. An example of such a platform is MOBROG. An advantage of these platforms is that, because you’re paying for the data, you’re more likely to get the right group of respondents to complete the survey within a short space of time.
- Directly from specific agency websites – there are a number of agencies to get data from. Sometimes you want to be more specific in your data acquisition and avoid the shotgun approach, probably because you already know what you’re looking for and where to get it. A good way to do this is to visit specific websites that host databases on the subject you’re gathering data about. Examples include browsing the WHO or FAO databases for public health and food security data, or going directly to a government database or that of a stock exchange.
- Creation of Google Forms – Google Forms is a great way of gathering data because, once a form is shared, it can keep being passed on. Some businesses, especially small start-ups, resort to this method. One advantage is that, as the data scientist, you design the forms yourself to suit exactly what you want to assess. However, a major setback can be scalability: the forms may not reach enough of the target audience, which results in a small amount of data.
- Acquisition with technical products – this involves creating technical products, such as application software, that meet specific needs of target respondents and are connected to a database for data storage. A good example is the acquisition of unstructured data such as images of people’s faces, which are mostly used to improve the performance of Artificial Intelligence systems.
- Web scraping – this is the act of using a scripting language like Python to extract relevant data from a website. It can be done in a number of ways. One is to write a program that calls an API (Application Programming Interface) and requests access to defined data, usually by providing a token, which is a unique pass for accessing that API. Many APIs are free, while some are subscription-based. Another way is to write code that scrapes inspected elements directly from a web page. This doesn’t usually require calling an API.
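As a rough illustration of scraping inspected elements from a page, here is a minimal sketch. It uses the standard library's `html.parser` rather than Beautiful Soup so that it runs without extra installs. The HTML fragment, the `product` class name and the `ProductParser` class are all hypothetical; a real page would be fetched over HTTP and would have its own structure.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; in practice this text would come
# from an HTTP response for the page being scraped.
SAMPLE_HTML = """
<ul>
  <li class="product">Rice</li>
  <li class="product">Orange juice</li>
  <li class="advert">Sponsored link</li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect the text of every <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.in_product = False
        self.products = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "li" and ("class", "product") in attrs:
            self.in_product = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_product = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching element.
        if self.in_product and data.strip():
            self.products.append(data.strip())


parser = ProductParser()
parser.feed(SAMPLE_HTML)
print(parser.products)
```

The exact-match check on the `class` attribute is a simplification; elements carrying several classes would need the attribute value split on whitespace first.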
However, an advantage of the API method is that a large volume of data can be generated within a short space of time. Another advantage is that you may get access to data your competitors have not yet collected. Libraries like Beautiful Soup are used to parse scraped pages, whereas a REST API is not a package but a web interface that your program calls over HTTP. API responses are usually in JSON format, which can be converted into more organized and readable formats like CSV.
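The token-based API call and the JSON-to-CSV conversion described above can be sketched as follows, using only the Python standard library. The endpoint URL, the token value and the sample response are hypothetical placeholders, and sending the token as a `Bearer` value in an `Authorization` header is an assumption; each API documents its own scheme.

```python
import csv
import io
import json
import urllib.request

# Hypothetical endpoint and token, for illustration only.
API_URL = "https://api.example.com/v1/products"
API_TOKEN = "YOUR_TOKEN_HERE"


def build_request(url, token):
    """Build a GET request that carries the token in an Authorization header.

    The Bearer scheme is an assumption; check the API's documentation.
    """
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    if not records:
        return ""
    output = io.StringIO()
    # Use the keys of the first record as the CSV header row.
    writer = csv.DictWriter(output, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return output.getvalue()


# Made-up response body standing in for what an API might return.
sample_response = '[{"item": "rice", "price": 2.5}, {"item": "juice", "price": 1.2}]'
print(json_to_csv(sample_response))

# With a real endpoint, the request would be sent like this:
# with urllib.request.urlopen(build_request(API_URL, API_TOKEN)) as resp:
#     print(json_to_csv(resp.read().decode("utf-8")))
```

This assumes every record is a flat object with the same keys as the first one; nested JSON would need flattening before it fits a CSV layout.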
Manual Data Acquisition
Manual data collection involves the use of means other than web-based methods to acquire data. It’s usually paper-based or meeting-based, and it long predates the web. Despite the availability of the web-based approach, this method is still in use and doesn’t look like it’s going away. In fact, manual methods are among the best ways of collecting primary data.
Although this method can be laborious and time-consuming, it’s still one of the cheapest methods of acquiring data. The major methods of manual data collection include:
- Through questionnaires – this is the most common manual approach. It involves writing self-defined questions on paper and reaching out to a particular audience to fill them in. This method is particularly common in academic research and pilot studies.
- Personal and telephone interviews – an interview is a conversation between the researcher and a respondent. It could be formal or informal, and could be held in person (one on one) or over the telephone.
Although several methods exist for collecting data, it is worth noting that a combination of two or more of these approaches can be highly effective, depending on urgency and needs.