Created: 2012-03-23 06:09
Updated: 2019-01-26 18:54


The original goal of visitante was to calculate various web analytics metrics, as defined by Avinash Kaushik, on the Hadoop, Spark and Storm platforms. However, it has since evolved into a general-purpose log analytics and mining solution that goes beyond web server logs.

It also includes a customer, or marketing, analytics solution. Since customer behavior data is mostly captured in logs, customer analytics and log analytics are closely related.


  • Simple, easy-to-use batch and real-time web analytics
  • Highly configurable


The following blog posts of mine are a good source of details about visitante


  • Hadoop-based batch analytics for

    • Number of pages visited
    • Total time spent
    • Last page visited
    • Flow status (e.g., whether checkout flow was entered, entered but not completed or completed)
    • Incident detection
    • Pattern based event detection with context
    • Customer lifetime value
  • Storm-based real-time analytics for

    • Bounce rate
    • Visit depth distribution
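
To make the session-level metrics listed above concrete, here is a minimal Python sketch. It is not taken from visitante's code and does not run on Hadoop or Storm; it only illustrates what the metrics mean on a small in-memory log. The record layout (session ID, timestamp, page), the sample data and the function names are all assumptions made for this example. Time spent is approximated as the gap between the first and last hit of a session, since the time on the final page cannot be known from the log alone.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log records: (session_id, ISO timestamp, page URL).
LOG = [
    ("s1", "2019-01-26T10:00:00", "/home"),
    ("s1", "2019-01-26T10:00:40", "/product/12"),
    ("s1", "2019-01-26T10:02:10", "/checkout"),
    ("s2", "2019-01-26T10:05:00", "/home"),
    ("s3", "2019-01-26T10:07:00", "/search"),
    ("s3", "2019-01-26T10:07:30", "/product/7"),
]


def session_metrics(records):
    """Group page hits by session and summarize each session."""
    hits = defaultdict(list)
    for session_id, timestamp, page in records:
        hits[session_id].append((datetime.fromisoformat(timestamp), page))

    metrics = {}
    for session_id, visits in hits.items():
        visits.sort(key=lambda visit: visit[0])  # order hits by time
        first_time, last_time = visits[0][0], visits[-1][0]
        metrics[session_id] = {
            "num_pages": len(visits),
            # Assumption: time spent = last hit time minus first hit time.
            "time_spent_sec": (last_time - first_time).total_seconds(),
            "last_page": visits[-1][1],
        }
    return metrics


def bounce_rate(metrics):
    """Fraction of sessions that viewed exactly one page."""
    if not metrics:
        return 0.0
    bounces = sum(1 for m in metrics.values() if m["num_pages"] == 1)
    return bounces / len(metrics)


def visit_depth_distribution(metrics):
    """Map visit depth (pages per session) to the number of sessions."""
    return dict(Counter(m["num_pages"] for m in metrics.values()))


if __name__ == "__main__":
    per_session = session_metrics(LOG)
    for session_id, summary in sorted(per_session.items()):
        print(session_id, summary)
    print("bounce rate:", bounce_rate(per_session))
    print("visit depth distribution:", visit_depth_distribution(per_session))
```

In a batch setting the per-session summary corresponds to a group-by on the session key, while the bounce rate and depth distribution are aggregates over those summaries, which is why they also suit incremental, real-time computation.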


For Hadoop 1

  • mvn clean install

For Hadoop 2 (non-YARN)

  • git checkout nuovo
  • mvn clean install

For Hadoop 2 (YARN)

  • git checkout nuovo
  • mvn clean install -P yarn

For Spark

  • Build chombo first, on the master branch, with
    • mvn clean install
    • sbt publishLocal
  • Build chombo-spark in chombo/spark directory
    • sbt clean package

Need help?

Please feel free to email me at


Contributors are welcome. Please email me at
