December 13, 2024
Writing Data to Amazon Kinesis Data Streams
- A producer is an application that writes data to Amazon Kinesis Data Streams
- An Amazon Kinesis Data Streams producer is an application that puts user data records into a Kinesis data stream
- The Kinesis Producer Library simplifies producer application development
- allows developers to achieve high write throughput to a Kinesis data stream
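The KPL itself is a Java library; as a minimal hedged sketch, a producer can also put records with the lower-level AWS SDK for Python (boto3). The stream name, payload, and partition key below are hypothetical examples, not values from these notes:

```python
import json

def build_record(payload, partition_key):
    """Serialize a payload into the arguments that put_record expects.

    The partition key determines which shard receives the record."""
    return {
        "Data": json.dumps(payload).encode("utf-8"),
        "PartitionKey": partition_key,
    }

def put_user_record(stream_name, payload, partition_key):
    """Put a single record onto the stream (network call, needs AWS credentials)."""
    import boto3  # AWS SDK for Python; the KPL adds batching for higher throughput
    kinesis = boto3.client("kinesis")
    return kinesis.put_record(StreamName=stream_name,
                              **build_record(payload, partition_key))

# Example (hypothetical stream name):
# put_user_record("clickstream", {"user": "u-42", "event": "page_view"}, "u-42")
```

Unlike this one-record-at-a-time sketch, the KPL aggregates and batches records behind the scenes, which is how it achieves high write throughput.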
Benefits of using Kinesis Data Streams
- Common use is the real-time aggregation of data, followed by loading the data into a data warehouse or map-reduce cluster
- Data in a Kinesis data stream is stored durably, and the stream can scale elastically
- Data can be retrieved less than 1 second after it's put on the stream
- Multiple Kinesis Data Streams applications can consume data from a stream
- archiving and processing can take place concurrently/independently
- Kinesis Client Library allows fault-tolerant consumption of stream data
- provides scaling support for Kinesis Data Streams applications
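The KCL automates shard discovery, checkpointing, and failover. As a rough sketch of what a consumer does underneath it, assuming boto3 (the loop-and-checkpoint logic the KCL would add is omitted):

```python
def decode_records(records):
    """Decode the Data blob of each fetched record back into text."""
    return [r["Data"].decode("utf-8") for r in records]

def read_shard(stream_name, shard_id, limit=100):
    """Fetch one batch of records from a single shard, oldest first.

    A real consumer would keep reading via NextShardIterator and
    checkpoint its progress, which is exactly what the KCL automates."""
    import boto3  # AWS SDK for Python; network call, needs credentials
    kinesis = boto3.client("kinesis")
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest available record
    )["ShardIterator"]
    return kinesis.get_records(ShardIterator=iterator, Limit=limit)["Records"]
```

Because each consumer application tracks its own iterator position, an archiving consumer and a processing consumer can read the same stream concurrently and independently.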
Creating and Updating Data Streams
- Amazon Kinesis Data Streams ingests data, stores the data and makes it available for consumption
- Unit of data stored by Kinesis data streams is a data record
- A group of data records is a data stream
- the data records in a data stream are distributed into shards
Data Shards
- A shard is a uniquely identified sequence of data records in a stream
- When you create a stream, you specify the number of shards for the stream
- The Total Capacity of a stream is the sum of the capacities of its shards
- the number of shards in the stream can be increased or decreased
- you are charged on a per-shard basis
- A producer puts data records into shards
- A consumer gets data records from shards
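Per the documented per-shard limits, each shard supports up to 1 MB/s and 1,000 records/s of writes, and 2 MB/s of reads. A small sketch of sizing a stream from those limits; the stream name and workload numbers are illustrative, and the API calls are commented out because they hit AWS:

```python
import math

def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Estimate a shard count from per-shard limits:
    1 MB/s and 1,000 records/s of writes, 2 MB/s of reads."""
    return max(
        math.ceil(write_mb_per_s / 1.0),
        math.ceil(records_per_s / 1000.0),
        math.ceil(read_mb_per_s / 2.0),
        1,  # a stream always has at least one shard
    )

# Creating or scaling the stream (requires boto3 and AWS credentials):
# kinesis = boto3.client("kinesis")
# kinesis.create_stream(StreamName="clickstream",
#                       ShardCount=shards_needed(3, 2500, 4))
# kinesis.update_shard_count(StreamName="clickstream", TargetShardCount=6,
#                            ScalingType="UNIFORM_SCALING")
```

Since billing is per shard, estimating the count from actual throughput rather than over-provisioning keeps costs down.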
Re-sharding Errors with Kinesis Data Streams
- After a re-sharding operation finishes, an extra open shard can be left over
- if an even number of shards was requested, the number of open shards can end up odd
- This happens when a shard's hash-key range is very small relative to the other shards in the stream: the difference between its StartingHashKey and EndingHashKey can be as small as 1, where normally it is very large
- Resolve by finding the ShardId of the shard covering the adjacent hash-key range and merging the small shard into that shard
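A sketch of that fix, assuming boto3 and the shard shape returned by list_shards; the width threshold and the choice of neighbor here are simplifications, and the stream name is hypothetical:

```python
def find_tiny_shards(shards, max_width=1):
    """Return (tiny_shard_id, adjacent_shard_id) pairs for open shards whose
    hash-key range width is at most max_width."""
    # an open shard has no EndingSequenceNumber yet
    open_shards = [s for s in shards
                   if "EndingSequenceNumber" not in s["SequenceNumberRange"]]
    # sorting by StartingHashKey puts shards with adjacent ranges next to each other
    open_shards.sort(key=lambda s: int(s["HashKeyRange"]["StartingHashKey"]))
    pairs = []
    for i, s in enumerate(open_shards):
        width = (int(s["HashKeyRange"]["EndingHashKey"])
                 - int(s["HashKeyRange"]["StartingHashKey"]))
        if width <= max_width:
            # the previous (or next) open shard covers the adjacent hash-key range
            neighbor = open_shards[i - 1] if i > 0 else open_shards[i + 1]
            pairs.append((s["ShardId"], neighbor["ShardId"]))
    return pairs

# Applying the merge (network calls, needs boto3 and AWS credentials):
# kinesis = boto3.client("kinesis")
# shards = kinesis.list_shards(StreamName="clickstream")["Shards"]
# for tiny, adjacent in find_tiny_shards(shards):
#     kinesis.merge_shards(StreamName="clickstream",
#                          ShardToMerge=tiny, AdjacentShardToMerge=adjacent)
```

MergeShards requires the two shards to cover adjacent hash-key ranges, which is why the sketch picks the neighbor in hash-key order rather than an arbitrary shard.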