Eliminating the Middleman: Peer-To-Peer Dataflow

March 7, 2012  •  Research Paper  •  6,035 Words (25 Pages)

Eliminating The Middleman: Peer-to-Peer Dataflow

Adam Barker

National e-Science Centre

University of Edinburgh

a.d.barker@ed.ac.uk

Jon B. Weissman

University of Minnesota,

Minneapolis, MN, USA.

jon@cs.umn.edu

Jano van Hemert

National e-Science Centre

University of Edinburgh

j.vanhemert@ed.ac.uk

ABSTRACT

Efficiently executing large-scale, data-intensive workflows such as Montage must take into account the volume and pattern of communication. When orchestrating data-centric workflows, centralised servers common to standard workflow systems can become a bottleneck to performance. However, standards-based workflow systems that rely on centralisation, e.g., Web service based frameworks, have many other benefits such as a wide user base and sustained support. This paper presents and evaluates a light-weight hybrid architecture which maintains the robustness and simplicity of centralised orchestration, but facilitates choreography by allowing services to exchange data directly with one another. Furthermore, our architecture is standards compliant, flexible and non-disruptive; service definitions do not have to be altered prior to enactment. Our architecture could be realised within any existing workflow framework; in this paper, we focus on a Web service based framework. Taking inspiration from Montage, a number of common workflow patterns (sequence, fan-in and fan-out), input to output data size relationships and network configurations are identified and evaluated. The performance analysis concludes that a substantial reduction in communication overhead results in a 2-4 fold performance benefit across all patterns. An end-to-end pattern through the Montage workflow results in an 8 fold performance benefit and demonstrates how the advantage of using our hybrid architecture increases as the complexity of a workflow grows.

Categories and Subject Descriptors

C.2.4 [Computer-Communication Networks]: Distributed Systems; C.4 [Performance of Systems]; D.2.11 [Software Engineering]: Software Architectures

General Terms

Design, Performance.

Keywords

Decentralised orchestration, workflow optimisation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

HPDC'08, June 23-27, 2008, Boston, Massachusetts, USA.

Copyright 2008 ACM 978-1-59593-997-5/08/06 ...$5.00.

1. INTRODUCTION

Efficiently executing large-scale, data-intensive workflows common to scientific applications must take into account the volume and pattern of communication. For example, in Montage [7] an all-sky mosaic computation can require between 2 and 8 TB of data movement. Standard workflow tools based on a centralised enactment engine, such as Taverna [19] and OMII BPEL Designer [18], can easily become a performance bottleneck for such applications: extra copies of the data (intermediate data) are sent that consume network bandwidth and overwhelm the central engine. Instead, a solution is desired that permits data output from one stage to be forwarded directly to where it is needed at the next stage in the workflow. It is certainly possible to develop an optimised workflow system from scratch that implements this kind of optimisation. In contrast, workflow systems based on concrete industrial standards offer a different set of benefits: they have a much larger and wider user base, which gives access to a greater range of supported tools and application components. This paper explores the extent to which the benefits of each approach can be realised. Can a standards-based workflow system achieve the performance optimisations of custom systems, and what are the trade-offs?
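To make the cost of routing intermediate data through a central engine concrete, the following Python sketch counts the data moved over the network for a simple sequence pattern under two schemes: every intermediate result returning to the engine, versus each stage forwarding its output directly to the next. It is a deliberately simplified model and not the architecture evaluated in this paper: the function names and the `sizes` list are hypothetical, control messages are ignored, and the direct scheme is assumed to return only the final output to the engine.

```python
from typing import Sequence


def centralised_transfer(sizes: Sequence[int]) -> int:
    """Data moved when every stage talks only to the central engine.

    sizes[0] is the initial input held by the engine; sizes[i] is the
    output size of stage i (hypothetical units, e.g. MB).
    """
    if len(sizes) < 2:
        raise ValueError("need an input size and at least one stage output")
    total = 0
    for i in range(1, len(sizes)):
        total += sizes[i - 1]  # engine sends the input to stage i
        total += sizes[i]      # stage i returns its output to the engine
    return total


def direct_transfer(sizes: Sequence[int]) -> int:
    """Data moved when each stage forwards its output to the next stage.

    Assumption: the engine sends the initial input once and receives only
    the final output, so each data item crosses the network exactly once.
    """
    if len(sizes) < 2:
        raise ValueError("need an input size and at least one stage output")
    return sum(sizes)


if __name__ == "__main__":
    # Three stages in sequence, each producing as much data as it consumes.
    sizes = [500, 500, 500, 500]
    print(centralised_transfer(sizes))  # 3000
    print(direct_transfer(sizes))       # 2000
```

In this model every intermediate result crosses the network twice under centralised enactment and once under direct forwarding, so the gap widens as stages are added. The sketch counts volume only; it says nothing about latency, bandwidth asymmetry between engine and services, or the measured speed-ups reported in the evaluation.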

1.1 Orchestration and Choreography

There are two common architectural approaches to implementing

...

...
