Software engineering notes

silicon valley hadoop user group 5-20-09: cloudera on automatic database import w/ sqoop

leave a comment »

- hadoop is great for unstructured data
- hadoop is not great for structured data
- how to glue data from mysql to unstructured data for hadoop

- uses jdbc to connect to db

- a bridge from jdbc result set to mapper value

- SQL-to-Hadoop
- jdbc-based interface
- auto datatype generation
- uses mapreduce to read tables from db
- imprts into hdfs and creates java file
- easy to import into hive
- serialized output is comma-separated

Written by Erik

May 20, 2009 at 6:16 pm

Posted in notes

Tagged with , ,

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: