News

Posted about 7 years ago by Pat Patterson
StreamSets solutions architect Alex Woolford is a data engineer with deep experience building robust and scalable solutions using technologies such as the StreamSets DataOps Platform, Apache Kafka, and the Cloudera and Hortonworks Hadoop distributions. In his role at StreamSets, Alex provides our customers with expertise including architecture design, demonstration systems, prototypes, presentations, and product configurations. […] The post Getting Started with StreamSets Control Hub (videos) appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Ji Sun Kim
Introduction: I am very excited to announce the new Hadoop FS Standalone origin in StreamSets Data Collector 3.2.0.0. Data Collector has long supported the Hadoop FS origin, but only in cluster mode. The Hadoop FS (HDFS) Standalone origin does not need MapReduce or YARN installed and can run in multithreaded mode, with each thread […] The post Synchronize HDFS Data into S3 Using the Hadoop FS Standalone Origin appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Dash Desai
Hello from your newly-appointed community champion and technical evangelist here at StreamSets! My name is Dash Desai and you will find me writing blog posts and cruising the community forums answering questions about StreamSets Data Collector as well as learning from community members. I will also be presenting at meetups and conferences so if you […] The post Preview and Snapshot Features in StreamSets Data Collector appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
Following on from last week's guest post from MapR's Ian Downard on integrating StreamSets Data Collector with MapR Persistent Application Client Container (PACC), MapR Distinguished Technologist John Omernik offers a cautionary tale on examining your assumptions before jumping into the world of Docker. We repost John's original article here with his kind permission. Since starting at MapR […] The post Using Docker Wrong: My Journey to a Better Container appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
Today's guest blogger is Ian Downard, a Senior Developer Evangelist at MapR Technologies. Ian focuses on machine learning and data engineering, and recently documented how he brought together the MapR Persistent Application Client Container (PACC) with StreamSets Data Collector and Docker to build pipelines for ingesting data into the MapR Converged Data Platform. We're reposting Ian's article here, with his […] The post Using StreamSets and MapR Together in Docker appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
Kinetica, just one of dozens of origins and destinations supported by StreamSets Data Collector, is a distributed, in-memory, GPU database designed for geospatial analysis, machine learning, predictive analytics, and other workloads requiring high-performance parallel processing. Mathew Hawkins, a Principal Solutions Architect at Kinetica, recently wrote an excellent tutorial on integrating Data Collector with Kinetica. We repost it here with […] The post Streaming Extreme Data Made Simple with Kinetica and StreamSets appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
Angel Alvarado is a senior software engineer at One Degree, a San Francisco-based non-profit, and also helps run the Molanco data engineering community. Angel previously contributed a Fun Example of Streaming Data into Minecraft; this time he gets serious with the Google Analytics API. Many thanks to Angel for his kind permission to adapt this article from his original. Back […] The post Extract Data from Google Analytics using StreamSets Data Collector appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
RingCentral is an award-winning global provider of cloud-unified communications and collaboration solutions. RingCentral solutions empower today’s mobile and distributed workforces to be connected anywhere and on any device through voice, video, team messaging, collaboration, SMS, conferencing, online meetings, contact center, and fax. RingCentral provides an open platform that integrates with today’s leading business apps while […] The post RingCentral Scales Out Big Data Streaming with StreamSets appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
Today's guest post is by Franck Pachot, an Oracle Consultant at dbi services in Switzerland. Franck has over 20 years of experience in Oracle, covering every aspect of the database from architecture and data modeling to tuning and operation. Franck recently documented his experiences testing StreamSets Data Collector's Oracle CDC origin, and kindly allowed us to repost his blog […] The post Change Data Capture from Oracle with StreamSets Data Collector appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.
Posted about 7 years ago by Pat Patterson
Nikolay Petrachkov (Nik for short) is a BI developer in Amsterdam by day, but in his spare time, he combines his passion for games and data engineering by building a project to analyze game-streaming data from Twitch. Nik discovered StreamSets Data Collector when he was looking for a way to build data pipelines to deliver insights […] The post Ingest Game-Streaming Data from the Twitch API appeared first on Continuous Dataflows Built with StreamSets DataOps Platform.