Software engineering notes

silicon valley hadoop user group 5-20-09: ibm research on hadoop over gpfs

- tested on jbot
- equivalent performance between hdfs and gpfs for non-trivial applications
- used Bonnie for filesys benchmarking
- cluster topology
-- standard hadoop uses local storage
--- cheap, scalable
-- full san uses central store
--- configurability of compute nodes
--- not as scalable
-- sub-cluster uses split storage
- conclusions
-- abstraction of filesys from mapreduce was good
-- gpfs (and other cluster filesys) can match performance of hdfs
- scalability?
-- gpfs runs on thousands of nodes
- fault tolerance?
-- not tested yet
- how similar is gpfs to unix filesys?
-- consistency issues are handled in a similar way
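
The conclusion about abstracting the filesys from mapreduce can be illustrated with a small Python sketch. This is a toy, not Hadoop's actual API: `FileSystem`, `InMemoryFS`, `LocalDirFS`, and `word_count` are hypothetical names made up for illustration, and the two backends merely stand in for something like hdfs and gpfs. The idea is only that the job is written against an interface, so the storage layer can be swapped without touching the job.

```python
from abc import ABC, abstractmethod
from collections import Counter
from pathlib import Path


class FileSystem(ABC):
    """Hypothetical storage interface (an assumption, not Hadoop's API)."""

    @abstractmethod
    def list_files(self):
        """Return the names of all files available to the job."""

    @abstractmethod
    def read(self, name):
        """Return the full text content of one file."""


class InMemoryFS(FileSystem):
    """Stand-in backend that keeps files in a dict."""

    def __init__(self, files):
        self._files = dict(files)

    def list_files(self):
        return sorted(self._files)

    def read(self, name):
        return self._files[name]


class LocalDirFS(FileSystem):
    """Stand-in backend that reads files from a local directory."""

    def __init__(self, root):
        self._root = Path(root)

    def list_files(self):
        return sorted(p.name for p in self._root.iterdir() if p.is_file())

    def read(self, name):
        return (self._root / name).read_text()


def word_count(fs):
    """Toy map/reduce job that only talks to the FileSystem interface."""
    # Map step: one partial word count per input file.
    partials = [Counter(fs.read(name).split()) for name in fs.list_files()]
    # Reduce step: merge the partial counts into one total.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Calling `word_count` with either backend runs the same job code; only the object passed in changes.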

Written by Erik

May 20, 2009 at 6:00 pm

Posted in notes
