Saturday, February 25, 2012

Similarity based recommendations with Cypher

Josh Adell recently published a blog post on the similarity-based recommendation engine he is building for rating and recommending beer, always a welcome service! His post shares his experience with Gremlin, a graph traversal language. I'm going to take his example and show how the same thing can be done using Cypher, Neo4j's query language.

So basically, we want to be able to recommend beers. But we don't just want the highly rated beers; a rating means much more to us if it comes from people whose tastes are similar to ours.
The goal is to answer two questions:
1. For a beer that I have not rated, what is the average rating given to it by people similar to me?
2. Which beers should I try and then rate? We want to recommend beers that have been rated 7 or higher by people similar to me.

To answer both these questions, we first need to determine which people have similar tastes. For our purposes, Josh defines a similar user as one whose ratings are, on average, within 2 points of my ratings for the same items.
We'll use this simple graph to test our Cypher queries:

(user1) -[:RATED 3]->(itemA)
(user2) -[:RATED 2]->(itemA)
(user3) -[:RATED 7]->(itemA)

(user1) -[:RATED 8]->(itemB)
(user2) -[:RATED 7]->(itemB)
(user3) -[:RATED 4]->(itemB)

(user2) -[:RATED 5]->(itemC)
(user3) -[:RATED 9]->(itemC)

(user2) -[:RATED 8]->(itemD)
(user3) -[:RATED 4]->(itemD)

Considering user1, the user with similar tastes is user2. Let's take a look at the Cypher query:

start me = node(user1) //Look up user1 via an index or some other means
match (me)-[myRating:RATED]->(i)<-[otherRating:RATED]-(u)
where abs(myRating.rating-otherRating.rating)<=2
return u

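The match clause finds all items that I rated which have also been rated by other users, represented by u.
The where clause filters out pairs of ratings that differ by more than 2 points. Note that this checks each shared item on its own rather than the average difference, and u is returned once per matching item. On the sample graph, user2 matches on both itemA and itemB while user3 matches on neither, so the result of this query is user2.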
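To sanity-check that result outside the database, here is a minimal Python sketch run against the sample graph. It follows Josh's definition literally (the average absolute difference over the items both users rated), which is a slightly different test from the per-rating filter in the where clause above. The `ratings` dict and the `similar_users` function are illustrative names of my own, not something from either post.

```python
# Sample graph from the post: user -> {item: rating}
ratings = {
    "user1": {"itemA": 3, "itemB": 8},
    "user2": {"itemA": 2, "itemB": 7, "itemC": 5, "itemD": 8},
    "user3": {"itemA": 7, "itemB": 4, "itemC": 9, "itemD": 4},
}


def similar_users(ratings, me, max_diff=2):
    """Return users whose ratings differ from mine by at most max_diff on average."""
    mine = ratings[me]
    result = []
    for other, theirs in ratings.items():
        if other == me:
            continue
        shared = set(mine) & set(theirs)
        if not shared:
            continue  # no items in common, so nothing to compare
        avg_diff = sum(abs(mine[i] - theirs[i]) for i in shared) / len(shared)
        if avg_diff <= max_diff:
            result.append(other)
    return result


print(similar_users(ratings, "user1"))  # ['user2']
```

user2 differs from user1 by 1 point on both shared items, while user3 differs by 4 points on each, so only user2 qualifies.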

Let's try to answer question 1: What is the average rating (by similar people) of a beer not rated by me?

start item=node(x), //Look up item via an index or other means
       similarUsers=node(u)  //similarUsers here is the result received in the first query above
match (similarUsers)-[r:RATED]->(item)
return AVG(r.rating)
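The same aggregation can be checked by hand on the sample graph. The sketch below is plain Python rather than Cypher; `average_rating` is a hypothetical helper, and the list of similar users is passed in explicitly, just as the query above takes it from the first query's result.

```python
# Sample graph from the post: user -> {item: rating}
ratings = {
    "user1": {"itemA": 3, "itemB": 8},
    "user2": {"itemA": 2, "itemB": 7, "itemC": 5, "itemD": 8},
    "user3": {"itemA": 7, "itemB": 4, "itemC": 9, "itemD": 4},
}


def average_rating(ratings, item, users):
    """Average rating of item among the given users, or None if none of them rated it."""
    scores = [ratings[u][item] for u in users if item in ratings[u]]
    return sum(scores) / len(scores) if scores else None


# user1 has not rated itemC; the only similar user is user2, who gave it a 5
print(average_rating(ratings, "itemC", ["user2"]))  # 5.0
```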

And finally, question 2: Which beers (that I haven't tried) have been rated 7 or higher by users similar to me?

start me=node(user1), //Look up user1 via an index or other means
       similarUsers=node(3) //similarUsers here is the result received in the first query above
match (similarUsers)-[r:RATED]->(item)
where r.rating >= 7 and not((me)-[:RATED]->(item))
return item
return item
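As a cross-check, here is the same filter as a small Python sketch on the sample graph: keep items that a similar user rated 7 or higher and that I have not rated myself. The `recommend` function and its `threshold` parameter are illustrative assumptions.

```python
# Sample graph from the post: user -> {item: rating}
ratings = {
    "user1": {"itemA": 3, "itemB": 8},
    "user2": {"itemA": 2, "itemB": 7, "itemC": 5, "itemD": 8},
    "user3": {"itemA": 7, "itemB": 4, "itemC": 9, "itemD": 4},
}


def recommend(ratings, me, similar, threshold=7):
    """Items rated >= threshold by any similar user that I have not rated yet."""
    tried = set(ratings[me])
    picks = set()
    for user in similar:
        for item, score in ratings[user].items():
            if score >= threshold and item not in tried:
                picks.add(item)
    return sorted(picks)


print(recommend(ratings, "user1", ["user2"]))  # ['itemD']
```

user2 also rated itemB a 7, but user1 has already tried it, so only itemD is recommended.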

There you go! Of course, there's much more to recommending beers. Read more about it in Josh's post.

-Luanne




Flavor of the month- Neo4j and Heroku, part 2

Continuing from the previous post, here are the features we used for the first version of Flavorwocky:

Set up:
When the application starts up, we check if the database is empty. To do this, we perform a traversal from the reference node looking for any categories connected to it. If there are none, then we go ahead and create the set of categories by first creating their nodes, and then a relationship to the reference node. We also create the index "ingredients", which is used to index ingredient names. Source code: https://github.com/luanne/flavorwocky/blob/master/grails-app/conf/BootStrap.groovy

Add pairing:
Adding a pairing involves
  • Checking whether either node already exists to avoid recreating it
  • Creating both nodes and linking them to their categories in a single transaction
  • Creating a relationship between them in the same transaction above
To accomplish this, we used the Batch operation (note, this is still experimental).
Source: https://github.com/luanne/flavorwocky/blob/master/grails-app/controllers/com/herokuapp/flavorwocky/FlavorwockyController.groovy (fetchOrCreateNodes())
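For readers who have not used the batch operation, here is a rough Python sketch of what such a request body can look like. It only builds the JSON; it does not send anything. The job format (a list of jobs with `method`, `to`, `body` and `id`, where `{0}` refers back to the result of job 0) is my understanding of the Neo4j REST batch endpoint. The relationship type `PAIRS_WITH`, the `affinity` property name and the `build_pairing_batch` helper are assumptions for illustration and may not match what Flavorwocky does. The sketch covers only the case where both ingredients are new and leaves out the existence check and the category links.

```python
import json


def build_pairing_batch(name_a, name_b, affinity):
    """Build a batch job list that creates two ingredient nodes and links them."""
    return [
        {"method": "POST", "to": "/node", "body": {"name": name_a}, "id": 0},
        {"method": "POST", "to": "/node", "body": {"name": name_b}, "id": 1},
        {
            "method": "POST",
            # "{0}" and "{1}" refer to the nodes created by jobs 0 and 1
            "to": "{0}/relationships",
            "body": {
                "to": "{1}",
                "type": "PAIRS_WITH",             # assumed relationship type
                "data": {"affinity": affinity},   # assumed property name
            },
            "id": 2,
        },
    ]


jobs = build_pairing_batch("Artichoke", "Lemon", 0.8)
print(json.dumps(jobs, indent=2))
```

Because all three jobs travel in one request, the nodes and the relationship are created together, which is the point of using the batch operation here.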

Auto complete: This was just an index lookup matching the partially entered ingredient by name. Source: https://github.com/luanne/flavorwocky/blob/master/grails-app/controllers/com/herokuapp/flavorwocky/FlavorwockyController.groovy (autosearch())

Visualization:
We used d3.js to provide two visualizations for the search results. The "Explore" visualization is based on the Node-Link tree; we used a Cypher query to find all ingredients that pair with the searched ingredient up to 3 levels deep, transformed the result into the appropriate data structure, and rendered it as JSON. Note that although the visualization is interactive, the fetching of data is not. The entire set of data for 3 levels is grabbed at once; a future enhancement would be to fetch children only when you expand a node.
Source: https://github.com/luanne/flavorwocky/blob/master/grails-app/controllers/com/herokuapp/flavorwocky/FlavorwockyController.groovy (getSearchVisualizationAsTreeJson())
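To give an idea of the transformation step, here is a small Python sketch that nests pairings under a root ingredient in the name/children shape that d3 tree layouts commonly consume. It is not the Flavorwocky code (that is Groovy, linked above); the `pairings` adjacency dict, the sample ingredients and the `build_tree` function are made up for illustration.

```python
import json


def build_tree(pairings, root, max_depth=3):
    """Nest pairings under root, up to max_depth levels, as name/children dicts."""
    def expand(name, depth, path):
        node = {"name": name}
        if depth < max_depth:
            # Skip ingredients already on the path from the root to avoid cycles
            children = [
                expand(n, depth + 1, path | {n})
                for n in sorted(pairings.get(name, ()))
                if n not in path
            ]
            if children:
                node["children"] = children
        return node

    return expand(root, 0, {root})


# Hypothetical pairings, stored in both directions
pairings = {
    "apple": ["cinnamon", "pork"],
    "cinnamon": ["apple", "pork"],
    "pork": ["apple", "cinnamon"],
}

tree = build_tree(pairings, "apple")
print(json.dumps(tree, indent=2))
```

In this sample, pork and cinnamon each appear twice in the output, once as a child of apple and once as a grandchild, because a tree has no way to share a node between two parents.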

Although the tree is pretty slick, one shortcoming is that it is a tree- so if children are linked to each other, then you see multiple instances of that ingredient in the tree. Hence we tried another visualization to capture the interconnections between ingredients and also surface interesting facts such as flavor trios- there's a pretty high chance that if you see a triangle in the network visualization, those three ingredients can be combined together well.
We used the Force directed graph for this. Again, a Cypher query came to the rescue (this time 5 levels deep to produce a richer model).
Source: https://github.com/luanne/flavorwocky/blob/master/grails-app/controllers/com/herokuapp/flavorwocky/FlavorwockyController.groovy (getSearchVisualizationAsNetworkJson())

In both visualizations, the affinity of a pairing is used to calculate the length of the connector between the two ingredients: the shorter the connector, the better the ingredients pair.
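As a rough illustration of that mapping, the Python sketch below turns a list of pairings into a nodes/links structure in which links refer to nodes by index, and derives a connector length from the affinity. The linear formula, the assumption that affinity lies between 0 and 1, the pixel range and the `build_network` name are all assumptions of mine; the actual calculation in Flavorwocky may differ.

```python
def build_network(pairings, min_len=30, max_len=150):
    """pairings: list of (ingredient_a, ingredient_b, affinity), affinity in [0, 1]."""
    names = sorted({n for a, b, _ in pairings for n in (a, b)})
    index = {name: i for i, name in enumerate(names)}
    nodes = [{"name": name} for name in names]
    links = [
        {
            "source": index[a],
            "target": index[b],
            # Higher affinity gives a shorter connector (assumed linear mapping)
            "distance": max_len - affinity * (max_len - min_len),
        }
        for a, b, affinity in pairings
    ]
    return {"nodes": nodes, "links": links}


network = build_network([("apple", "cinnamon", 1.0), ("apple", "pork", 0.5)])
for link in network["links"]:
    print(link)
```

With these made-up numbers, the apple/cinnamon connector comes out at the minimum length, and apple/pork lands halfway between the minimum and maximum.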

That's about it! There's so much more that can be done with this application, but it's going to have to wait for a bit.
Again, please spread the word about Flavorwocky if you like it or just want me to win. Voting also helps!

-Luanne

Flavor of the month- Neo4j add-on for Heroku

Neo4j launched a challenge earlier this year called "Seed the Cloud" to get folks to create templates or demo applications on Heroku using the Neo4j add-on. After much internal debate, I decided to enter, only to be thrown into despair for lack of an idea. The idea came to me while I was doing nothing in particular- to build a simple app that would help one find ingredients whose flavors complement one another.

Basically, you have these ingredients that pair really well together- knowing which ingredients have flavor affinities can produce some amazing new dishes. 
The app allows you to add pairings with an 'affinity'- how well they pair together- and search for an ingredient to find others that pair with it.

The app is built using Grails 2.0 for the front end, the visualizations are the result of the very neat d3.js library, and interaction with Neo4j is done using the Neo4j REST API.
The entire thing is deployed on Heroku (with the Neo4j add-on), while the source is available on GitHub.
To get started with Grails 2.0, the Neo4j add-on and Heroku, read Aldrin's post on the topic: http://thought-bytes.blogspot.in/2012/02/grails-20-heroku-and-neo4j-addon.html

How to deploy Flavorwocky locally as well as on Heroku is documented in the Readme.

The model lends itself very nicely to a graph- as you can see, it is very simple:

Every category is color coded- for convenience, it is stored as a property on the category node, but of course, it doesn't have to be. 

Flavorwocky was also picked as the basis for this challenge because it is a real-world use case for a graph, and I wanted this entry to focus primarily on Neo4j rather than on supporting bells and whistles. The features of the Neo4j REST API used in this app will be the topic of the next post.
Gentle reminder: Please spread the word about Flavorwocky if you like it or just want me to win. Voting also helps!

-Luanne

Grails 2.0, Heroku and Neo4j addon


This post will help you get started with Grails 2.0 on Heroku with the Neo4j addon.
This largely follows the guide at http://devcenter.heroku.com/articles/grails
Note: This does not use the Grails Neo4j plugin (http://grails.org/plugin/neo4j), since I couldn't get it working with Grails 2.0 (Will post an update as and when I can)
(Update: here's the one with the neo4j plugin http://thought-bytes.blogspot.in/2012/03/getting-started-with-grails-neo4j.html)

Create the grails app
>grails create-app heroku-neo4j
>cd heroku-neo4j

Delete the DataSource.groovy
>del grails-app\conf\DataSource.groovy

Check into Git
Create a .gitignore file for ignored files and check into git
*.iws
*Db.properties
*Db.script
.settings
.classpath
.project
eclipse
stacktrace.log
target
/plugins
/web-app/plugins
/web-app/WEB-INF/classes
web-app/WEB-INF/tld/c.tld
web-app/WEB-INF/tld/fmt.tld


>git init
>git add .
>git commit -m init

Create a Heroku app
>heroku create --stack cedar

...and deploy
>git push heroku master

Check it out in the browser
>heroku open

Add the Neo4j addon
>heroku addons:add neo4j:test

Install the Grails Rest plugin
>grails install-plugin rest

Now you can access the Neo4j datastore on Heroku via its REST interface. For example, to create a new node from a controller,


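import static groovyx.net.http.ContentType.JSON

    // Declared outside the closure so it is still in scope for render below;
    // named neoResponse to avoid shadowing the controller's own response object
    def neoResponse
    withRest(id: "neo4j", uri: "<REST-URL>") {
        auth.basic '<login>', '<password>'
        // POST the node's properties as JSON to create the node
        neoResponse = post(contentType: JSON, requestContentType: JSON, body: ['name': 'Artichoke'])
    }
    render neoResponse.status
    render neoResponse.data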


Replace <REST-URL>, <login> and <password> with your Neo4j instance's settings.
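If you want to see what goes over the wire, here is a minimal Python sketch that builds the equivalent HTTP request without sending it. It assumes the REST URL is the data root of the instance (something ending in /db/data) and that nodes are created by POSTing their properties as JSON to the node endpoint under it; the `build_create_node_request` helper, the localhost URL and the credentials are placeholders, not values from the add-on.

```python
import base64
import json
import urllib.request


def build_create_node_request(rest_url, login, password, properties):
    """Build (but do not send) a POST request that creates a node with the given properties."""
    token = base64.b64encode(f"{login}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=rest_url.rstrip("/") + "/node",  # assumes rest_url is the data root
        data=json.dumps(properties).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_create_node_request(
    "http://localhost:7474/db/data", "user", "secret", {"name": "Artichoke"}
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```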

-Aldrin

Tuesday, February 7, 2012

Getting Maven to just run - tips, workarounds

Recently, I was trying to run an inherited project but couldn't get Maven to compile it. Some local libraries were not being picked up, so I had to add them manually to the local Maven repository. When your project depends on a library that you cannot find in a public Maven repository, you can install it into your local repository as below:
mvn install:install-file -Dfile=C:\my-awesome-api-1.0.jar -DgroupId=com.awesome -DartifactId=awesome-api -Dversion=1.0 -Dpackaging=jar

Then add a <dependency> to your pom.xml,
<dependency>
    <groupId>com.awesome</groupId>
    <artifactId>awesome-api</artifactId>
    <version>1.0</version>
</dependency>


Once the compilation went through successfully, the tests started failing. If you don't particularly care about the tests, you can skip them by adding -Dmaven.test.skip=true or -DskipTests=true to the command line (the former also skips compiling the tests, while the latter compiles them but does not run them),
ex. > mvn package -DskipTests=true