notes: Bayjax Meetup @ Yahoo! Sunnyvale (7/27): Satyen Desai on YUI3 architecture

Satyen Desai describing architecture of YUI3

satyen desai talking about YUI 3 arch concepts & lessons learned

– motivation
— yui2 is mature, why change it?
— lighter 
— allow fine-grained include control
— rethink the way we use code: move away from traditional inheritance model towards js augmentation & mixins
— make it easier
— yui2 has four different widget api classes; yui3 has a single, standard api
— make common actions easier
—- iteration
—- chaining
— runtime performance, ie make it faster
— yui2 has always used a good namespace
— yui3 takes namespacing further by giving you instance-level control
– examples
— self-populating
— yui3 pulls down dependencies in an optimized way
—- no more file-order concerns
— yui3 offers protection
— each instance is sandboxed and pulls in its dependencies independent of other instances
— self-populating
— naturally creates anonymous function wrappers
— code re-use
— yui3 avoids the kitchen sink by breaking libs into sub-modules and allowing the developer to only load the submodules required
– plugins and examples
— in yui2, all instances of a class contain the kitchen sink
— in yui3, we can use and extend at the sub-module-level
– events
— built from decoupled code
— event facades wrap events in a consistent, normalized interface
— facades wrap custom events as well
— on and after events are built into the event publisher
— bubbling
— yui3 affords more control over the event stack
— detaching listeners
– node facade
— a single location for working w/ anything html related
— enhances and normalizes
— yui3 builds utils into the facade as opposed to yui2’s library-based orientation
— extendable
— we can attach plug-ins to a node, eg an io object
— iteration and batch operations are supported
– core lang conveniences
— isType methods
– questions
— cross domain?
— managed via flash object
— multiple versions?
— the last version loaded is the current version available
— what does yui3 do better than other libs?
— yui3 excels in readability and maintainability
— can yui3 be used on top of yui2?
— currently, you can use both on a page, but not necessarily build one on the other
– this talk is available online
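
The on/after idea from the events notes above can be illustrated with a small, language-neutral sketch. This is not YUI3 code and not its API: the `Publisher` and `Event` classes and all method names below are hypothetical, written in Python only to show the ordering (on listeners, then the default behavior, then after listeners) and how an on listener can prevent the default.

```python
class Event:
    """Hypothetical event facade handed to every listener."""

    def __init__(self, name):
        self.type = name
        self.prevented = False

    def prevent_default(self):
        # An "on" listener calls this to stop the default behavior.
        self.prevented = True


class Publisher:
    """Hypothetical publisher with separate on and after listener lists."""

    def __init__(self):
        self._on = {}
        self._after = {}
        self._defaults = {}

    def publish(self, name, default_fn=None):
        # Register an event name and its optional default behavior.
        self._defaults[name] = default_fn

    def on(self, name, fn):
        self._on.setdefault(name, []).append(fn)

    def after(self, name, fn):
        self._after.setdefault(name, []).append(fn)

    def fire(self, name):
        event = Event(name)
        # 1. "on" listeners run before the default behavior.
        for fn in self._on.get(name, []):
            fn(event)
        if event.prevented:
            # Default behavior and "after" listeners are skipped.
            return event
        # 2. The default behavior runs.
        default_fn = self._defaults.get(name)
        if default_fn is not None:
            default_fn(event)
        # 3. "after" listeners run once the default behavior is done.
        for fn in self._after.get(name, []):
            fn(event)
        return event
```

With this shape, a listener that needs to veto a change subscribes with `on`, while a listener that only reacts to a completed change subscribes with `after`.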

 

meetup: http://www.meetup.com/BayJax/calendar/10852424/

– this talk is available online: http://developer.yahoo.com/yui/theater/

hadoop-scale

“Hundreds of gigabytes of data constitute the low end of Hadoop-scale. Actually Hadoop is built to process “web-scale” data on the order of hundreds of gigabytes to terabytes or petabytes. At this scale, it is likely that the input data set will not even fit on a single computer’s hard drive, much less in memory. So Hadoop includes a distributed file system which breaks up input data and sends fractions of the original data to several machines in your cluster to hold.”

http://developer.yahoo.com/hadoop/tutorial/module1.html

hadoop summit 09 > applications track > lightning talks

emi
– hadoop is for performance, not speed
– use activerecord or hibernate for rapid, iterative web dev
– few businesses write map reduce jobs –> use cascading instead
– emi is a ruby shop
– I2P
— feed + pipe script + processing node
— written in a ruby dsl
— can run on a single node or in a cluster
— all data is pushed into S3, which is great cause it’s super cheap
— stack: aws > ec2 + s3 > conductor + processing node + processing center > spring + hadoop > admin + cascading > ruby-based dsl > zookeeper > jms > rest
— deployment via chef
— simple ui (built by engineers, no designer involved)
– cascading supports dsls
– “helping computers learn languages”
– higher accuracy can be achieved using a dependency syntax tree, but this is expensive to produce
– the expectation-maximization algorithm is a cheaper alternative
– easy to parallelize, but not a natural fit for map-reduce
— map-reduce overhead can become a bottleneck
– 15x speed-up using hadoop on 50 processors
– allowing 5% of data to be dropped results in a 22x speed-up w/ no loss in accuracy
– a more complex algorithm, not more data, resulted in better accuracy
– bayesian estimation w/ bilingual pairs, a more complex algo, with only 8000 sentences results in 62% accuracy (after a week of calculation!)

hadoop summit 09 > applications track > Case Studies on EC2

ref: http://developer.yahoo.com/events/hadoopsummit09/

– eHarmony

— matching people is an N^2 process

— run hadoop jobs on EC2 and S3

— results downloaded from S3 and imported into BerkeleyDB

— S3 is a great place to store huge files for a long time because it’s so cheap

— switched from bash to ruby because ruby has better exception handling

— elastic map reduce has replaced 150 lines of ec2 management script

 

– share this

— simplifies sharing online content: delicious + ping.fm + bit.ly

— they’re a small company, but they need to keep pace w/ the volume of the large publishers they support

— they’re 100% based on AWS

— aster + lamp stack + cascading running hadoop (to clean logs before pushing data into db) + s3 + sqs

— sharded search mostly used for business intel

— cascading allows efficient hadoop coding, more so than pig

— in the hadoop book, the author of cascading wrote a case study on sharethis

 

– lookery

— started as an ad network on facebook

— built completely on aws

— use a javascript-based tracker like google analytics to gather data

— data acquisition + data serving + reporting + billing–> all done in hadoop

— they use voldemort, a distributed key/val store instead of memcache

— heavy use of hadoop streaming w/ python

 

– deepdyve

— a search engine

— having an elastic infrastructure allows for innovation

— using hadoop, they went from 1 wk to 1 hr for indexing

— start spinning up new clusters and discarding old ones

— ec2 + katta + zookeeper + hadoop + lucene –>most of the software they run, they didn’t have to write

— query times are lower, user satisfaction is higher

— problems:

—- unstable aws

—- session timeout on zookeeper

—- slow provisioning for aws

— with aws, they can run load tests to prepare for spikes
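
The Lookery notes above mention Hadoop Streaming with Python. As a rough sketch of what such a job looks like: a streaming mapper reads raw lines and emits tab-separated key/value lines, and the reducer receives those lines sorted by key. The log format (URL as the first whitespace-separated field) and the function names are assumptions made for illustration, not anything shown in the talk.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'url<TAB>1' for each tracker log line.

    Assumes (hypothetically) that the URL is the first field of each line.
    """
    for line in lines:
        fields = line.split()
        if fields:
            yield f"{fields[0]}\t1"


def reducer(sorted_lines):
    """Sum the counts per key; input must already be sorted by key,
    which is what Hadoop guarantees between the map and reduce steps."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        yield f"{key}\t{total}"


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process for testing."""
    return list(reducer(sorted(mapper(lines))))
```

In a real streaming job, each function would live in its own script that reads `sys.stdin` and prints its output lines; `run_local` only imitates the sort that Hadoop performs between the two steps.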

App Engine Y!AP app that pushes updates via OpenSocial JS API

Usage:

  1. create App Engine app
  2. edit main.py to look like the code below and deploy
  3. create YAP app
  4. set app base url to yourappname.appspot.com/example
  5. preview your app

import wsgiref.handlers
from google.appengine.ext import webapp

class ExampleHandler(webapp.RequestHandler):
	def post(self):
		html = """
		<script type="text/javascript">
		//ref: http://developer.yahoo.com/yap/guide/opensocial-examples.html

		//build an activity from a title and body, then request its creation
		var postActivity = function(title, body) {
				var params = {};
				params[opensocial.Activity.Field.TITLE] = title;
				params[opensocial.Activity.Field.BODY] = body;
				var activity = opensocial.newActivity(params);
				opensocial.requestCreateActivity(
					activity,
					opensocial.CreateActivityPriority.LOW,
					function(){});
			},
			//callback for the viewer request: post an update using the viewer's name
			handleResponse = function(response){

				var viewer = response.get('viewer').getData(),
					name = viewer.getDisplayName();

				postActivity(
					name + ' posted an update ...',
					'... using OpenSocial!'
				);
			},
			//fetch the viewer's profile, then hand it to handleResponse
			getViewerData = function() {
				var req = opensocial.newDataRequest();
				req.add(req.newFetchPersonRequest("VIEWER"), "viewer");
				req.send(handleResponse);
			};

		//this is the bare minimum code to push updates
		var params = {};
		params[opensocial.Activity.Field.TITLE] = 'title';
		params[opensocial.Activity.Field.BODY] = 'body';
		var activity = opensocial.newActivity(params);
		opensocial.requestCreateActivity(
			activity,
			opensocial.CreateActivityPriority.LOW,
			function(){});

		//this is a slightly enhanced update flow
		getViewerData();
		</script>
		"""
		self.response.headers['Content-Type'] = 'text/html'
		self.response.out.write(html)

application = webapp.WSGIApplication(
	[('/example', ExampleHandler)],
	debug=True)

def main():
	wsgiref.handlers.CGIHandler().run(application)

if __name__ == '__main__':
	main()

previewing Y! App. Platform small view content

<!--
usage:
- upload this file to a server
- create yap app
- set app url to the location of this file and preview
- click the 'load' link to load 'preview' div w/ small view code
-->
<div style="margin:20px;">
<div id="preview" style="border:1px dashed #bbbbbb;width:300px;margin-bottom:20px;">preview</div>
load</div>
load

building an RSS consolidator w/ Yahoo! Pipes

Goal:

Consolidate the feeds from my Flickr, WordPress, Del.icio.us, etc accounts

Solution:

Use Yahoo! Pipes

Intended Audience:

Pipes newbies

Steps:

Create a pipe

  1. Go to http://pipes.yahoo.com and log in, or create an account if you don’t already have one
  2. Click the “Create a pipe” button
  3. In the editor, click and drag a “Fetch Feed” object onto the stage
  4. Open a new browser tab or window, head over to http://flickr.com/, and log in, or create an account
  5. Click on the “Your photostream” link to view your photostream
  6. Scroll to the bottom of the page and copy the “Latest” RSS feed URL
  7. Switch back to the Pipes editor and paste the Flickr URL into the text field in the “Fetch Feed” object on the stage
  8. At the bottom of the “Fetch Feed” object, there is a circular attachment point.  Click on it and drag to start forming a new pipe
  9. Drag the pipe to the attachment point on top of the “Pipe Output” object at the bottom of the stage.
  10. Release the pipe when the “Pipe Output” attachment point glows.
  11. Click the “Save” button to name and save your new pipe
  12. Click the “Back to my Pipes” link next to the save button
  13. Click on the name of your pipe in the list of your pipes.  This will display the output of your pipe.
  14. Click the “More Options” link and then “Get as RSS” in the drop-down.  This will open the RSS feed in your browser if you’re using Safari or Firefox.
Build page
  1. Following an example from Rasmus Lerdorf’s Open Hack talk (http://talks.php.net/show/hack08/8), use simplexml to parse the RSS feed:
<?php
// Paste the RSS URL of your pipe here
$url = '';
//e.g. http://pipes.yahoo.com/pipes/pipe.run?_id=gIBkHK_V3RGpmv5oBRNMsA&_render=rss
$xml = simplexml_load_file($url);
$num_items = count($xml->channel->item);
// Print the title, date, and description of each feed item
for ($i = 0; $i < $num_items; $i++) {
    $title = $xml->channel->item[$i]->title;
    $date = $xml->channel->item[$i]->pubDate;
    $desc = $xml->channel->item[$i]->description;
    $link = $xml->channel->item[$i]->link;

    echo "<div>";
    echo "<h3><a href='$link'>$title</a></h3>";
    echo "$date";
    echo "$desc";
    echo "</div>";
}
  2. Go back to the browser tab displaying the output from your pipe and copy the URL
  3. Paste this URL into your index.php file as the value for the variable “$url”
  4. Change the beginning of the URL from ‘feed://…’ to ‘http://…’
  5. Load your page and you should see the output defined above.  Note: var_dump the ‘$xml’ object to see the other available fields.
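
For readers not working in PHP, the parsing step above can be sketched in Python with the standard library's `xml.etree.ElementTree`. The sample feed below is made up for illustration and only mimics the RSS 2.0 fields used in the PHP snippet; a real page would fetch the pipe's RSS URL instead of using an inline string.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed with the same fields the PHP code reads.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample pipe output</title>
    <item>
      <title>First photo</title>
      <link>http://example.com/1</link>
      <pubDate>Mon, 27 Jul 2009 10:00:00 +0000</pubDate>
      <description>A photo</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
      <pubDate>Tue, 28 Jul 2009 09:30:00 +0000</pubDate>
      <description>A blog post</description>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return one dict per <item> with title, link, pubDate, description."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_RSS):
        print(entry["pubDate"], entry["title"], entry["link"])
```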
Extend pipe
  1. Back in the Pipes editor, drag another “Fetch Feed” object onto the stage
  2. Copy and paste another RSS feed into it
  3. From under the “Operators” heading in the list on the left, drag out a “Union” object
  4. Run the pipes from the two “Fetch Feed” objects into the union and pipe the output to the “Pipe Output” object
  5. To add more feeds, drag out additional “Fetch Feed” objects and connect them to the union
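
Conceptually, the “Union” step above just combines the items of several feeds into one list. A minimal sketch of that idea in Python, assuming each feed has already been parsed into a list of dicts with a `pubDate` string in the usual RSS date format. The `newest_first` helper is an addition of this sketch (in Pipes, ordering would be a separate operator), and all names here are hypothetical.

```python
from email.utils import parsedate_to_datetime


def union(*feeds):
    """Concatenate the items of every feed into a single list."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def newest_first(items):
    """Sort items by their RSS pubDate, most recent first."""
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=True,
    )


# Made-up items standing in for two parsed feeds.
flickr = [
    {"title": "Photo", "pubDate": "Mon, 27 Jul 2009 10:00:00 +0000"},
]
blog = [
    {"title": "Post", "pubDate": "Tue, 28 Jul 2009 09:30:00 +0000"},
    {"title": "Older post", "pubDate": "Sun, 26 Jul 2009 08:00:00 +0000"},
]

combined = newest_first(union(flickr, blog))
```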