Trickle-Feeding Log Data into the HBase NoSQL Database using Flume

Wednesday, May 21st, 2014

The other day I posted an article on the blog about using Flume to transport Apache webserver log entries from our website into Hadoop, with the final destination for the entries being an HDFS file – one that essentially mirrors the contents of the webserver log file. Once you’ve set this transport mechanism […]
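As a rough illustration of the kind of setup the excerpt describes, here is a minimal sketch of a Flume agent configuration that tails an Apache access log and writes the entries to HDFS. The agent name, log path, and HDFS path are all hypothetical placeholders, not taken from the original articles:

```properties
# Hypothetical Flume agent: tail an Apache access log and land entries in HDFS
agent1.sources = weblog-source
agent1.channels = mem-channel
agent1.sinks = hdfs-sink

# Source: run tail -F against the webserver access log (path is an assumption)
agent1.sources.weblog-source.type = exec
agent1.sources.weblog-source.command = tail -F /var/log/httpd/access_log
agent1.sources.weblog-source.channels = mem-channel

# Channel: buffer events in memory between source and sink
agent1.channels.mem-channel.type = memory
agent1.channels.mem-channel.capacity = 1000

# Sink: write events to an HDFS directory as plain text (path is an assumption)
agent1.sinks.hdfs-sink.type = hdfs
agent1.sinks.hdfs-sink.channel = mem-channel
agent1.sinks.hdfs-sink.hdfs.path = /user/flume/weblogs
agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
```

The agent would then be started with Flume’s `flume-ng agent` command, naming `agent1` and pointing at this properties file.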

Trickle-Feeding Log Files to HDFS using Apache Flume

Sunday, May 18th, 2014

In some previous articles on the blog I’ve analysed Apache webserver log files sitting on a Hadoop cluster using Hive, Pig and, most recently, Apache Spark. In all cases the log files were already sitting on the Hadoop cluster, having been SFTP’d to my local workstation and then uploaded to HDFS, the Hadoop distributed filesystem, using […]
