Case 2.1: Facebook’s Big Data Storage, What is Facebook's main motivation

Question # 00862870 Posted By: wildcraft Updated on: 11/05/2024 10:15 PM Due on: 11/06/2024
Subject Computer Science Topic General Computer Science Tutorials:
Question

Read “Case 2.1: Facebook’s Big Data Storage” and write an essay that answers the following questions:

1. What is Facebook's main motivation for creating new database management systems?

2. What are some of the challenges that Facebook faces when it comes to managing big data?

3. How does Facebook's use of Scuba and Cubrick database management systems improve its advertising efforts?

4. What types of data does Facebook collect from its users to inform its advertising efforts?

5. How does Facebook use machine learning to analyze user data and improve its advertising targeting?

6. What are the benefits and potential drawbacks of Facebook's use of big data and machine learning for advertising?

7. How does Facebook ensure that user data is protected and not misused for advertising purposes?

8. What are some potential ethical concerns related to Facebook's use of big data and machine learning for advertising?

9. How can Facebook improve transparency and communication with its users regarding its use of their data for advertising purposes?

10. What are some potential future developments in the field of big data and machine learning that could impact Facebook's advertising efforts? 

Requirements:

· There is no specific page requirement for your analysis. Instead, your work will be evaluated based on how thoroughly it addresses each of the questions that have been outlined for you.

· You must utilize proper APA formatting and citations throughout your paper. If you use any supporting evidence from external sources, it is imperative that you provide accurate citations for each reference.

· You must include a minimum of two sources from scholarly articles or business periodicals, aside from the course textbook. 

 

Case 2.1: Facebook’s Big Data Storage

Facebook uses many database management systems to sort and store big data, including Hive, Hadoop, and Operational Data Store (ODS). These systems are not traditional relational databases; they process data for analysis, such as deciding whom to advertise to, rather than serving users directly. Hadoop is the base for Facebook’s database management systems, but Facebook is constantly looking to create new ones to become more efficient. Facebook receives an enormous amount of data every minute of every day. On average Facebook’s system “processes 2.5 billion pieces of content and 500+ terabytes of data per day. It is pulling in 2.7 billion like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half hour” (Constine, 2012). The database management systems Facebook has traditionally used have become less efficient as the data has grown. To gain an advantage, Facebook is working to create better database management systems.

Facebook’s data warehouses, also known as the Hive, are capable of storing more than 300 petabytes of data in more than 800,000 tables. The Hive runs more than 600,000 queries and 1 million map-reduce jobs per day. The common query engines over Hive include Presto, HiveQL, Hadoop, and Giraph. The data is used for a variety of applications, from traditional database processing to image analytics, machine learning, and real-time interactive analytics (Wiener and Bronson, 2014). Facebook uses the Operational Data Store (ODS) to support data mining of operational data, or base data that is summarized for a data warehouse. ODS stores about 2 billion time series records. The data is loaded into an enterprise data architecture after being extracted from operational databases. The process includes standardization, cleansing, consolidation, and transformation. ODS is used most commonly in alerts and dashboards and for troubleshooting system metrics with 1–5 min of time lag. It handles about 40,000 queries per second (Wiener and Bronson, 2014).

Two database management systems that Facebook specifically created to sort data faster for advertising are Scuba and Cubrick. These database management systems work fast to find the data Facebook can use to target advertising. This helps with knowing what users want at that moment. The time it takes to process this data has shortened dramatically. Facebook receives millions of data points every second, and its drive to create new database management systems aims to significantly increase the speed of information analysis, a goal it is always working toward.

Scuba is just “one of many ‘Big Data’ software platforms Facebook has produced to control the information generated by its online operation – platforms that push the boundaries of distributed computing, the art of training hundreds or even thousands of computers on a single task” (Metz, 2017). Scuba is mainly used at Facebook for performance monitoring, trend spotting, and pattern mining. It grabs the data, sorts it faster, and deletes data by itself. Scuba sometimes has to delete data because the data is old or because the table has limited space. It deletes the oldest data in that table and keeps the most relevant data. Scuba keeps all data in “high-speed memory systems running across hundreds of computer servers – not the hard disks, the memory systems – and this means you can query the data in near real-time” (Metz, 2017). By sorting current data and finding what users are doing now, Facebook can choose which advertisements users will respond to best. Facebook can earn much more money when an advertisement is liked, clicked on, commented on, or shared, so advertising strategically makes more money for Facebook. Data scientists can use database management systems to analyze how effectively Facebook is running and how its users behave. This means Facebook can feed data directly to users (Metz, 2017).
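The eviction behavior described above (drop the oldest rows when data ages out or the table runs out of space) can be illustrated with a minimal sketch. This is not Scuba's actual implementation; the class name BoundedTable, its parameters, and the single-process design are assumptions made purely for illustration, and rows are assumed to arrive in timestamp order.

```python
from collections import deque


class BoundedTable:
    """Hypothetical in-memory table that evicts its oldest rows first."""

    def __init__(self, max_rows, max_age_seconds):
        self.max_rows = max_rows                # space limit (assumed parameter)
        self.max_age_seconds = max_age_seconds  # age limit (assumed parameter)
        self.rows = deque()                     # (timestamp, row), oldest on the left

    def insert(self, timestamp, row):
        # Assumes timestamps arrive in non-decreasing order.
        self.rows.append((timestamp, row))
        self._evict(now=timestamp)

    def _evict(self, now):
        # Drop rows that are too old.
        while self.rows and now - self.rows[0][0] > self.max_age_seconds:
            self.rows.popleft()
        # Drop the oldest rows if the table is over its size limit.
        while len(self.rows) > self.max_rows:
            self.rows.popleft()

    def query(self, predicate):
        # Scan the in-memory rows and return those matching the predicate.
        return [row for _, row in self.rows if predicate(row)]


table = BoundedTable(max_rows=3, max_age_seconds=60)
for t in (0, 10, 20, 30):
    table.insert(t, {"event": "click", "t": t})
recent_clicks = table.query(lambda row: row["event"] == "click")
```

Here the fourth insert pushes the table past its three-row limit, so the row from time 0 is discarded and only the three newest rows remain available to queries.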

Cubrick is used more for sorting generic data. Facebook inputs a table of information into the database management system, and Cubrick takes all the attributes and connects the similar ones. The colored dots represent users, and their attributes are connected by bricks, which together form one cube. It is like graphing but with multiple graphs connected by the same range and dimension. This enables a lean database mechanism, though one that can operate only on primitive data types. Cubrick sorts the data and can rapidly group people together based on similar characteristics like gender or region. That is how Facebook can group certain people for advertising.
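As a rough illustration of grouping rows into bricks by shared dimension ranges, consider the sketch below. It is not Cubrick's actual code; the function names, the sample rows, and the bucket rules (exact region, ten-year age ranges) are all hypothetical.

```python
from collections import defaultdict


def brick_key(row, dimensions):
    # Map each dimension value to its bucket; the tuple of buckets names the brick.
    return tuple(bucket(row[name]) for name, bucket in dimensions.items())


def build_bricks(rows, dimensions):
    # Place every row in the brick that covers its dimension values.
    bricks = defaultdict(list)
    for row in rows:
        bricks[brick_key(row, dimensions)].append(row)
    return dict(bricks)


# Hypothetical sample data and bucket rules.
rows = [
    {"user": 1, "region": "US", "age": 23},
    {"user": 2, "region": "US", "age": 27},
    {"user": 3, "region": "BR", "age": 41},
]
dimensions = {
    "region": lambda value: value,           # one bucket per region
    "age": lambda value: (value // 10) * 10,  # ten-year age ranges
}
bricks = build_bricks(rows, dimensions)
audience_sizes = {key: len(members) for key, members in bricks.items()}
```

Users 1 and 2 land in the same brick because they share a region and an age range, which is the kind of grouping the passage describes for selecting an advertising audience.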

Facebook uses these database management systems because they are faster and return just the data Facebook wants. This helps Facebook advertise based on what the consumer wants now, not what they wanted in the past. Facebook groups people together using the data, and the grouping can be based on simple demographics such as gender, age, income, housing type, and education level. All this information can be filled out on a person’s profile and can be public, as mentioned before. Facebook can also market by geography, advertising only in certain countries, states, or regions, because areas differ in weather and in their distinct cultures and values. Again, Facebook gains this information when you fill out a profile specifying where you live, and if you have your location on, it can track where you go. Price segmentation can be used as well but can be harder to track. Facebook can decide what products to market to users, such as cheap, medium-priced, or expensive products. To do this, it targets users based on the occupation and degree level listed on their profiles and on their posts. After Facebook gets the data, it feeds the data into its prediction algorithm.

Facebook uses a machine learning algorithm to rank what it thinks you would like to see in your feed. The algorithm does not just predict whether you will actually “hit the like button on a post based on your past behavior. It also predicts whether you’ll click, comment, share, or hide it, or even mark it as spam. It will predict each of these outcomes, and others, with a certain degree of confidence, then combine them all to produce a single relevancy score that’s specific to both you and that post” (Oremus, 2016). This is how Facebook can choose which advertisements to show to people. The higher the chance that a consumer will respond to the advertisement in any positive way, the more likely the advertisement is to succeed. If the user shares it, likes it, or tags someone recommending the product, other people can see the ad and make a purchase as well. There is a high chance that many friends of someone who likes the product will also like the product, making this system very effective.
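One simple way to combine several predicted outcomes into a single relevancy score is a weighted sum, sketched below. The source does not say how Facebook combines its predictions, so the weights, the outcome names, and the probabilities here are invented for illustration only.

```python
# Hypothetical weights: positive outcomes raise the score, negative ones lower it.
WEIGHTS = {
    "like": 1.0,
    "click": 1.0,
    "comment": 2.0,
    "share": 3.0,
    "hide": -4.0,
    "spam": -8.0,
}


def relevancy_score(predicted, weights=WEIGHTS):
    # predicted maps an outcome name to its estimated probability (0.0 to 1.0).
    return sum(weights[outcome] * prob for outcome, prob in predicted.items())


def rank(candidates):
    # Sort candidate posts or ads from most to least relevant for one user.
    return sorted(candidates, key=lambda c: relevancy_score(c["predicted"]), reverse=True)


# Invented predictions for two ads shown to the same user.
candidates = [
    {"id": "ad_b", "predicted": {"like": 0.5, "hide": 0.25}},
    {"id": "ad_a", "predicted": {"like": 0.5, "share": 0.25}},
]
ranked_ids = [c["id"] for c in rank(candidates)]
```

Both ads have the same predicted chance of a like, but ad_a ranks first because a likely share adds to its score while a likely hide subtracts from ad_b's score.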
