
News

Posted over 8 years ago by Pat Patterson
One of the great things about StreamSets Data Collector is that its record-oriented architecture allows great flexibility in creating data pipelines – you can plug together pretty much any combination of origins, processors and destinations to build a data flow. After I wrote the Ingesting Local Data into Azure Data Lake Store tutorial, it occurred to me ... The post Drift Synchronization with StreamSets Data Collector and Azure Data Lake appeared first on StreamSets.
Posted over 8 years ago by Rupal Shah
MapR-DB is an enterprise-grade, high-performance NoSQL database management system. As a multi-model NoSQL database, it supports both JSON document models and wide column data models. MapR-DB stores JSON documents in tables; documents within a table in MapR-DB can have different structures. StreamSets Data Collector enables working with MapR-DB documents with its powerful schema-on-read and ... The post Read and Write JSON to MapR DB with StreamSets Data Collector appeared first on StreamSets.
Posted over 8 years ago by Kirit Basu
We are happy to announce the newest version of StreamSets Data Collector is available for download. This short release has over 25 new features and improvements and over 50 bug fixes. This is an enterprise-focused release that addresses the needs of some of the world's largest organizations using StreamSets. Below is a short list of what's new, please check ... The post Announcing StreamSets Data Collector ver 2.4.0.0 appeared first on StreamSets.
Posted over 8 years ago by Pat Patterson
The Spark Evaluator, introduced in StreamSets Data Collector (SDC) version 2.2.0.0, lets you run an Apache Spark application, termed a Spark Transformer, as part of an SDC pipeline. Back in December, we released a tutorial walking you through the process of building a Transformer in Java. Since then, Maurin Lenglart, of Cuberon Labs, has contributed skeleton code for ... The post Running Scala Code in StreamSets Data Collector appeared first on StreamSets.
Posted over 8 years ago by Pat Patterson
Azure Data Lake Store (ADLS) is Microsoft's cloud repository for big data analytic workloads, designed to capture data for operational and exploratory analytics. StreamSets Data Collector (SDC) version 2.3.0.0 included an Azure Data Lake Store destination, so you can create pipelines to read data from any supported data source and write it to ADLS. Since configuring the ADLS ... The post Ingest Data into Azure Data Lake Store with StreamSets Data Collector appeared first on StreamSets.
Posted over 8 years ago by Pat Patterson
StreamSets Data Collector has long supported both reading and writing data from and to relational databases via Java Database Connectivity (JDBC). While it was straightforward to configure pipelines to read data from individual tables, ingesting records from an entire database was cumbersome, requiring a pipeline per table. StreamSets Data Collector (SDC) 2.3.0.0 introduces the JDBC Multitable ... The post Replicating Relational Databases with StreamSets Data Collector appeared first on StreamSets.
Posted over 8 years ago by Kirit Basu
We’re excited to release the next version of the StreamSets Data Collector. This release has 80+ new features and improvements, and 150+ bug fixes. Multithreaded Pipelines: We’ve updated the SDC framework to allow individual pipelines to scale up on a single machine. This functionality is origin dependent. To start, we’ve designed a new HTTP Server origin ... The post Announcing Data Collector ver 2.3.0.0 appeared first on StreamSets.
Posted over 8 years ago by Pat Patterson
Nick Cadenhead, a Senior Consultant at 9th BIT Consulting in Johannesburg, South Africa, uses Couchbase Server to power analytics solutions for his clients. In this blog entry, reposted from his article at LinkedIn, Nick explains why he selected StreamSets Data Collector for data ingest, and how he extended it with a custom destination to write data to Couchbase. For some time, ... The post Ingesting data into Couchbase using StreamSets Data Collector appeared first on StreamSets.
Posted over 8 years ago by Pat Patterson
Splunk indexes and correlates log and machine data, providing a rich set of search, analysis and visualization capabilities. In this blog post, I'll explain how to efficiently send high volumes of data to Splunk's HTTP Event Collector via the StreamSets Data Collector Jython Evaluator. I'll present a Jython script with which you'll be able to build ... The post Ingest Data into Splunk with StreamSets Data Collector appeared first on StreamSets.