Motivation
- Hadoop is great for unstructured data
- Hadoop is less well suited to structured data
- how to combine data from MySQL with unstructured data in Hadoop?
DBInputFormat
- uses JDBC to connect to the database
DBWritable
- a bridge from a JDBC result set to the mapper value
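
A minimal sketch of the bridge idea, under stated assumptions. Hadoop's real interface is `org.apache.hadoop.mapreduce.lib.db.DBWritable`; to keep the example runnable without Hadoop on the classpath, a local stand-in with the same two methods is declared here. The `EmployeeRecord` class, its columns, and the `fakeRow` helper are hypothetical and exist only for illustration.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DBWritableSketch {

    // Local stand-in mirroring the two methods of Hadoop's DBWritable
    // (assumption: declared here only so the sketch compiles without Hadoop).
    interface DBWritable {
        void write(PreparedStatement statement) throws SQLException;
        void readFields(ResultSet resultSet) throws SQLException;
    }

    // Hypothetical record for a table like employees(id, name).
    static class EmployeeRecord implements DBWritable {
        long id;
        String name;

        // Copy the current row of the result set into this object's fields.
        @Override
        public void readFields(ResultSet rs) throws SQLException {
            id = rs.getLong(1);
            name = rs.getString(2);
        }

        // Bind this object's fields to an INSERT/UPDATE statement.
        @Override
        public void write(PreparedStatement ps) throws SQLException {
            ps.setLong(1, id);
            ps.setString(2, name);
        }
    }

    // Illustration-only helper: a fake one-row ResultSet, so the sketch
    // runs without a database. Only getLong and getString are supported.
    static ResultSet fakeRow(long id, String name) {
        return (ResultSet) Proxy.newProxyInstance(
            DBWritableSketch.class.getClassLoader(),
            new Class<?>[] { ResultSet.class },
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "getLong": return id;
                    case "getString": return name;
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    public static void main(String[] args) throws SQLException {
        EmployeeRecord record = new EmployeeRecord();
        record.readFields(fakeRow(7L, "Ada"));
        System.out.println(record.id + " " + record.name);
    }
}
```

In a real job the record class would implement Hadoop's own interface, and the framework, not user code, would call `readFields` once per row before handing the object to the mapper as its value.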
Sqoop
- SQL-to-Hadoop
- JDBC-based interface
- automatic datatype generation
- uses MapReduce to read tables from the database
- imports into HDFS and creates a Java file
- easy to import into Hive
- serialized output is comma-separated
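
To make the last two points concrete, here is a rough sketch of what a per-table record class with comma-separated output could look like. It is not Sqoop's actual generated code: the `Employee` class, its fields, and the `toCsvLine` and `parse` methods are hypothetical, and the sketch assumes field values never contain commas.

```java
public class CsvRecordSketch {

    // Hypothetical stand-in for a class generated from a table
    // like employees(id, name, salary).
    static class Employee {
        final int id;
        final String name;
        final double salary;

        Employee(int id, String name, double salary) {
            this.id = id;
            this.name = name;
            this.salary = salary;
        }

        // Serialize one row as a single comma-separated line.
        String toCsvLine() {
            return id + "," + name + "," + salary;
        }

        // Parse a line written by toCsvLine back into a record.
        // Assumes exactly three fields and no embedded commas.
        static Employee parse(String line) {
            String[] parts = line.split(",", -1);
            if (parts.length != 3) {
                throw new IllegalArgumentException("expected 3 fields: " + line);
            }
            return new Employee(
                Integer.parseInt(parts[0]),
                parts[1],
                Double.parseDouble(parts[2]));
        }
    }

    public static void main(String[] args) {
        Employee e = new Employee(7, "Ada", 1200.5);
        String line = e.toCsvLine();
        Employee back = Employee.parse(line);
        System.out.println(line + " -> " + back.name);
    }
}
```

Because each row is plain delimited text, a downstream mapper or a Hive table definition only needs to know the delimiter and the column order to read the imported files.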