Posted over 8 years ago by Pat Patterson
New in StreamSets Data Collector (SDC) 2.2.0.0 is the Spark Evaluator, a processor stage that allows you to run an Apache Spark application, termed a Spark Transformer, as part of an SDC pipeline. With the Spark Evaluator, you can build a pipeline to
|
Posted over 8 years ago by Kirit Basu
And here it is, folks, the last release of 2016 – StreamSets Data Collector version 2.2.0.0. We’ve put in a host of important new features and resolved 120+ bugs. We’re gearing up for a solid roadmap in 2017, enabling exciting new use cases and
|
Posted over 8 years ago by Pat Patterson
Apache Flume “is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data”. The typical use case is collecting log data and pushing it to a destination such as the Hadoop Distributed
|
Posted almost 9 years ago by Rick Bilodeau
It’s been a little over a year (9/24/15) since we launched StreamSets Data Collector as an open source project. For those of you unfamiliar with the product, it’s any-to-any big data ingestion software through which you can build and place into
|
Posted almost 9 years ago by Pat Patterson
As you likely already know, StreamSets Data Collector (SDC) is open source, made available via the Apache 2.0 license. The entire source code for the product is hosted in a GitHub project and the binaries are always available for download. As well as
|
Posted almost 9 years ago by Rick Bilodeau
Reposted from the Cloudera Vision blog. What do Sony, Target and the Democratic Party have in common? Besides being well-respected brands, they’ve all been subject to some very public and embarrassing hacks over the past 24 months. Because cybercrime
|
Posted almost 9 years ago by Pat Patterson
Back in March, I wrote a tutorial showing how to create a custom destination for StreamSets Data Collector (SDC). Since then I’ve been looking for a good sample use case for a custom processor. It’s tricky to find one, since the set of out-of-the-box
|
Posted almost 9 years ago by Kirit Basu
We’re happy to announce a new release of the Data Collector. This minor release has 30+ bug fixes, a number of improvements, and a few new features: A Package Manager that allows you to install new Stage Libraries (Origins, Processors
|
Posted almost 9 years ago by Pat Patterson
Sandish Kumar, a Solutions Engineer at phData, builds and manages solutions for phData customers. In this article, reposted from the phData blog, he explains how to generate simulated NetFlow data, read it into StreamSets Data Collector via the UDP
|
Posted almost 9 years ago by Kirit Basu
Last October, we publicly announced StreamSets Data Collector version 1.0. Over the last 12 months we have seen an awesome (a word we don’t use lightly) amount of adoption of our first product – from individual developers simplifying their day-to-day
|