Online Safety Community

To understand the goals of MapReduce, it helps to know which scenarios MapReduce is optimized for. The MapReduce programming model was created for processing data that exhibits “data parallelism”: the ability to compute multiple independent operations in any order (King). In parallel processing, commutative operations are operations where the order of execution does not affect the result. Commutativity can apply to complex operations and even whole processes, as long as they don’t manipulate the same memory. For example, in the figure below, as long as foo(a) and bar(b) don’t manipulate the same variable, they can run in parallel in different threads. The write operation, however, must wait for both foo() and bar() to complete, so the figure shows it as depending on both of them.

Figure 1 – Parallelism Dependency Graph
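A minimal Python sketch of the dependency graph described above, under stated assumptions: the bodies of foo(), bar() and write() are hypothetical placeholders (the post does not say what they compute), and a thread pool from the standard library stands in for "different threads".

```python
from concurrent.futures import ThreadPoolExecutor


def foo(a):
    # Hypothetical placeholder: touches only its own argument.
    return a * 2


def bar(b):
    # Hypothetical placeholder: shares no state with foo().
    return b + 10


def write(foo_result, bar_result):
    # Depends on BOTH results, so it cannot start until both are done.
    return f"foo={foo_result}, bar={bar_result}"


def run(a, b):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # foo and bar are independent, so they may run in either order
        # or at the same time.
        foo_future = pool.submit(foo, a)
        bar_future = pool.submit(bar, b)
        # result() blocks until each task finishes; this is the
        # synchronization point in front of write().
        return write(foo_future.result(), bar_future.result())


if __name__ == "__main__":
    print(run(3, 4))
```

Because foo() and bar() share nothing, swapping the two submit() calls does not change what write() receives.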

One of the goals of parallelism is identifying the logical “tasks” or units which can be run in parallel as threads. Parallel programming techniques require developers to implement dependency graphs, which become much more complex as the amount of shared information and the number of ordered operations increase. Techniques such as locks and barriers, critical sections, semaphores, monitors, RPC and rendezvous have been proposed to aid in the design of multithreaded and distributed programs. In parallel and distributed processing, intelligent task design attempts to eliminate as many synchronization points as possible, but some will still be required. “Master/Worker” and “Producer/Consumer” are two patterns that developers can use to implement parallel thread processing.
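As an illustration of the “Master/Worker” pattern mentioned above, here is a small sketch using Python's standard queue and threading modules. The squaring task, the function names and the worker count are all assumptions made for the example; the point is only that the workers run independently and the master synchronizes with them once, at join().

```python
import queue
import threading


def worker(tasks, results):
    # Each worker pulls tasks until it sees the None sentinel.
    while True:
        item = tasks.get()
        if item is None:
            break
        # Hypothetical unit of work: square the input.
        results.put((item, item * item))


def master(items, num_workers=3):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    # One sentinel per worker tells each of them to stop.
    for _ in threads:
        tasks.put(None)
    # The only explicit synchronization point: wait for all workers.
    for t in threads:
        t.join()
    collected = {}
    while not results.empty():
        key, value = results.get()
        collected[key] = value
    return collected


if __name__ == "__main__":
    print(master([1, 2, 3, 4]))
```

The queues do the locking internally, which is why the worker code itself contains no explicit locks.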

MapReduce provides a programming model which abstracts many of the aforementioned complexities of parallel processing away from the software engineer. The MapReduce implementation performs much of the “wiring” associated with parallel processing, leaving the developer to implement relatively simple methods. The use of MapReduce does come with some constraints, making it less appropriate for some tasks. MapReduce models are optimized for tasks where a large number of key/value input pairs must be processed largely independently. For the MapReduce implementation to make use of parallelization, the map() method must be commutative in the sense described above: calls on different inputs must give the same results in any order and must not touch shared state. MapReduce enables parallelization across hundreds and even thousands of CPUs.
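To make the model concrete, here is a toy word count in Python. It is a single-process sketch of the map, group-by-key and reduce steps, not the API of Hadoop or any other real MapReduce implementation; map_fn, reduce_fn and map_reduce are hypothetical names, and a thread pool stands in for the cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_fn(key, value):
    # Emits (word, 1) for every word in one input record.
    # Reads only its own arguments, so calls can run in any order.
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    # Combines all values emitted for one key.
    return key, sum(values)


def map_reduce(records, workers=4):
    # Map phase: every (key, value) record is processed independently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(lambda kv: map_fn(*kv), records))
    # Shuffle phase: group the intermediate values by key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    # Reduce phase: one call per distinct key.
    return dict(reduce_fn(key, values) for key, values in groups.items())


if __name__ == "__main__":
    docs = [("doc1", "the quick fox"), ("doc2", "the lazy dog the end")]
    print(map_reduce(docs))
```

The developer writes only map_fn and reduce_fn; everything inside map_reduce is the “wiring” that a real framework would supply and distribute across machines.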

Understanding Data Parallelism in MapReduce

Tags: program, Implementation, Mapreduce

Started by gracylayla Mar 14.

© 2018   Created by Safety Community.   Powered by
