Using Kafka Streams to conditionally sort a JSON input stream


I am new to developing Kafka Streams applications. My stream processor is meant to sort JSON messages based on the value of the UserID key in each input message.
Message 1: {"UserID": "1", "Score":"123", "meta":"qwert"}
Message 2: {"UserID": "5", "Score":"780", "meta":"mnbvs"}
Message 3: {"UserID": "2", "Score":"0", "meta":"fghjk"}
I have read in "Dynamically connecting a Kafka input stream to multiple output streams" that there is no dynamic solution.
In my use case I know the user keys and the output topics into which I need to sort the input stream, so I am writing a separate processor application for each user, each one matching a different UserID.
All of these stream processor applications read from the same JSON input topic in Kafka, but each one writes a message to its user-specific output topic only if the preset user condition is met.
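import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.util.HashMap;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.TopologyBuilder;

public class SwitchStream extends AbstractProcessor<String, String> {
    @Override
    public void process(String key, String value) {
        HashMap<String, String> message = new HashMap<>();
        // A new ObjectMapper is created for every record
        ObjectMapper mapper = new ObjectMapper();
        try {
            message = mapper.readValue(value, HashMap.class);
        } catch (IOException e) {
            // Parse errors are ignored; message stays empty and the record is not forwarded
        }
        // User condition UserID = 1
        if ("1".equals(message.get("UserID"))) {
            context().forward(key, value);
            // Requests a commit after every forwarded record
            context().commit();
        }
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sort-stream-processor");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        // Source -> Process -> Sink
        TopologyBuilder builder = new TopologyBuilder();
        builder.addSource("Source", "INPUT_TOPIC");
        builder.addProcessor("Process", SwitchStream::new, "Source");
        builder.addSink("Sink", "OUTPUT_TOPIC", "Process");

        KafkaStreams streams = new KafkaStreams(builder, props);
        streams.start();
    }
}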
Question 1:
Is it possible to achieve the same functionality easily using the high-level Streams DSL instead of the low-level Processor API? (I admit I found the other online examples of the high-level Streams DSL harder to understand and follow.)
Question 2:
The input JSON topic receives input at a high rate of 20K-25K EPS (events per second). My processor applications don't seem to be able to keep pace with this input stream. I have tried deploying multiple instances of each processor, but the results are nowhere close to where I want them to be. Ideally, each processor instance should be able to process 3K-5K EPS.
Is there a way to improve my processor logic, or to write the same logic using the high-level Streams DSL? Would that make a difference?
You can do this in the high-level DSL via filter() (you effectively implemented a filter, as you only forward a message if its UserID is 1). You could generalize this filter pattern by using KStream#branch() (see the docs for further details: http://docs.confluent.io/current/streams/developer-guide.html#stateless-transformations). Also read the JavaDocs: http://kafka.apache.org/0102/javadoc/index.html?org/apache/kafka/streams
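final ObjectMapper mapper = new ObjectMapper(); // created once and reused for every record
KStreamBuilder builder = new KStreamBuilder();
builder.<String, String>stream("INPUT_TOPIC")
    .filter(new Predicate<String, String>() {
        @Override
        public boolean test(String key, String value) {
            // put your processor logic here
            try {
                return "1".equals(mapper.readTree(value).path("UserID").asText());
            } catch (IOException e) {
                return false; // drop records that are not valid JSON
            }
        }
    })
    .to("OUTPUT_TOPIC");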
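To make the branch() idea more concrete, here is a minimal, dependency-free sketch of the first-match routing it performs. It is not Kafka code: the class UserRouter and its methods are hypothetical names made up for illustration, and the regex-based extraction assumes flat JSON messages with a string-valued "UserID" field, as in the question (a real application would more likely use a JSON parser such as Jackson).

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that models routing a record to the first matching user.
final class UserRouter {
    // Assumption: "UserID" is a string-valued field of a flat JSON object.
    private static final Pattern USER_ID =
            Pattern.compile("\"UserID\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the UserID value, or null if the field is absent.
    static String extractUserId(String json) {
        Matcher m = USER_ID.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Returns the index of the first user ID that matches the record,
    // or -1 if none matches (the record would be dropped).
    static int branchIndex(String json, List<String> userIds) {
        String id = extractUserId(json);
        return id == null ? -1 : userIds.indexOf(id);
    }

    public static void main(String[] args) {
        List<String> users = Arrays.asList("1", "2", "5");
        String msg = "{\"UserID\": \"5\", \"Score\":\"780\", \"meta\":\"mnbvs\"}";
        System.out.println(branchIndex(msg, users)); // prints 2
    }
}
```

In the DSL, KStream#branch() takes one predicate per user and returns an array of streams indexed in the same way: a record goes to the stream of the first predicate it matches and is dropped if it matches none. Each stream in the array can then be written to its own output topic with to(), so a single application could replace the per-user applications.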
About performance: a single instance should be able to process 10K+ records per second. Without further information, it is hard to tell what the problem might be. I would recommend asking on the Kafka user mailing list (see http://kafka.apache.org/contact).
