<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Piecing things together (Posts about clojure)</title><link>https://wjoel.com/</link><description></description><atom:link href="https://wjoel.com/categories/clojure.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><lastBuildDate>Sun, 15 Oct 2023 18:25:37 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Streaming Wikipedia edits with Spark and Clojure</title><link>https://wjoel.com/posts/spark-streaming-wikipedia-in-clojure.html</link><dc:creator>Joel Wilsson</dc:creator><description>&lt;div&gt;&lt;div id="outline-container-orge6b3df7" class="outline-2"&gt;
&lt;h2 id="orge6b3df7"&gt;Wikipedia edits and IRC&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-orge6b3df7"&gt;
&lt;p&gt;
The Wikimedia project has an IRC server with &lt;a href="https://meta.wikimedia.org/wiki/IRC/Channels#Raw_feeds"&gt;raw feeds&lt;/a&gt; of changes to the
different Wikimedia wikis, and we can join &lt;code&gt;#en.wikipedia&lt;/code&gt; to see all edits to
the English language Wikipedia in real-time.
&lt;/p&gt;

&lt;p&gt;
The &lt;a href="https://flink.apache.org/"&gt;Apache Flink&lt;/a&gt; project has a &lt;a href="https://github.com/apache/flink/blob/master/flink-contrib/flink-connector-wikiedits/src/main/java/org/apache/flink/streaming/connectors/wikiedits/WikipediaEditsSource.java"&gt;streaming source&lt;/a&gt; for those edits, which is great
for getting started with streaming data processing. &lt;a href="https://spark.apache.org/"&gt;Apache Spark&lt;/a&gt; does not have
such a source, so we'll make one. To be precise, we'll implement a Spark
Streaming &lt;a href="https://spark.apache.org/docs/latest/streaming-custom-receivers.html"&gt;custom receiver&lt;/a&gt;, and use &lt;code&gt;clj-bean&lt;/code&gt; from the previous post to
do it.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-orgf7b25d2" class="outline-2"&gt;
&lt;h2 id="orgf7b25d2"&gt;The Receiver interface&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-orgf7b25d2"&gt;
&lt;p&gt;
We only need to implement three methods: &lt;code&gt;onStart&lt;/code&gt;, &lt;code&gt;onStop&lt;/code&gt; and &lt;code&gt;receive&lt;/code&gt;,
but first we need to create a Java class with &lt;code&gt;gen-class&lt;/code&gt;.
&lt;/p&gt;

&lt;p&gt;
Since we want to play nice with the rest of the JVM ecosystem
(Java, Scala, etc.) and Clojure doesn't support interfaces with generics,
we specify &lt;code&gt;AbstractWikipediaEditReceiver&lt;/code&gt; in Java.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kd"&gt;public&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;abstract&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AbstractWikipediaEditReceiver&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;extends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Receiver&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;WikipediaEditEvent&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;public&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;AbstractWikipediaEditReceiver&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;StorageLevel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;storageLevel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="kd"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;storageLevel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;
From the &lt;a href="https://wjoel.com/posts/clojure-java-beans.html"&gt;previous post&lt;/a&gt; you may recall that &lt;code&gt;:constructors&lt;/code&gt; is a map from
the types of our constructor methods to the types of the superclass's
constructor. We have three constructors, one without arguments in which
case we'll choose a random nickname when connecting, one with the nickname
specified, and one with both the nickname and the &lt;a href="http://spark.apache.org/docs/2.1.0/api/java/index.html?org/apache/spark/storage/StorageLevel.html"&gt;storage level&lt;/a&gt; for our
RDDs.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:gen-class&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;com.wjoel.spark.streaming.wikiedits.WikipediaEditReceiver&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:extends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;com.wjoel.spark.streaming.wikiedits.AbstractWikipediaEditReceiver&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:init&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;init&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:state&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;state&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:prefix&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"receiver-"&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:constructors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{[]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;org.apache.spark.storage.StorageLevel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;                &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;String&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;org.apache.spark.storage.StorageLevel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;                &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;org.apache.spark.storage.StorageLevel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;org.apache.spark.storage.StorageLevel&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:main&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;
The IRC library we use has a default adapter for receiving events. All the
messages we are interested in will be sent as private messages, so we can
use &lt;a href="https://clojuredocs.org/clojure.core/proxy"&gt;proxy&lt;/a&gt; to implement only &lt;code&gt;onPrivMsg&lt;/code&gt; to call a message handling function
for each message received.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;make-irc-events-listener&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;message-fn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;proxy &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;IRCEventAdapter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;onPrivmsg&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;message-fn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;
It's easy but tedious to connect to the IRC server, but once we have a
connection we can add this listener and use the &lt;code&gt;store&lt;/code&gt; method of our
&lt;code&gt;Receiver&lt;/code&gt; to pass the events back to Spark.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.addIRCEventListener&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-irc-events-listener&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when-let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;edit-event&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;edit-event-message-&amp;gt;edit-event&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.store&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;edit-event&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;
&lt;code&gt;edit-event-message-&amp;gt;edit-event&lt;/code&gt; uses a regexp to extract the different
fields from the message and create a &lt;code&gt;WikipediaEditEvent&lt;/code&gt; JavaBean which
we have created using &lt;a href="https://github.com/wjoel/clj-bean"&gt;clj-bean&lt;/a&gt;, as described at the end of the previous
post.
&lt;/p&gt;

&lt;p&gt;
We need to start a thread in &lt;code&gt;onStart&lt;/code&gt; and clean up in &lt;code&gt;onStop&lt;/code&gt;.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;connect-as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;com.wjoel.spark.streaming.wikiedits.WikipediaEditReceiver&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;nick&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when-let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;conn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;IRCConnection.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;wikimedia-irc-host&lt;/span&gt;
&lt;span class="w"&gt;                                  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;int-array&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;wikimedia-irc-port&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;nick&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;nick&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;nick&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.put&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;java.util.HashMap&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.state&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="s"&gt;"connection"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;init-connection&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;receiver-onStart&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;com.wjoel.spark.streaming.wikiedits.WikipediaEditReceiver&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.start&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Thread.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="w"&gt;                     &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;connect-as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-from-state&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:nick&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;receiver-onStop&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;com.wjoel.spark.streaming.wikiedits.WikipediaEditReceiver&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;conn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;IRCConnection&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-from-state&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:connection&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="nv"&gt;conn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.isConnected&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;doto &lt;/span&gt;&lt;span class="nv"&gt;conn&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.send&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;"PART "&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;wikimedia-channel&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.interrupt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;.join&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://wjoel.com/posts/spark-streaming-wikipedia-in-clojure.html"&gt;Read more…&lt;/a&gt; (1 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description><category>clojure</category><category>spark</category><guid>https://wjoel.com/posts/spark-streaming-wikipedia-in-clojure.html</guid><pubDate>Tue, 21 Feb 2017 21:03:00 GMT</pubDate></item><item><title>Creating JavaBeans with Clojure</title><link>https://wjoel.com/posts/clojure-java-beans.html</link><dc:creator>Joel Wilsson</dc:creator><description>&lt;div&gt;&lt;div id="outline-container-org863ff58" class="outline-2"&gt;
&lt;h2 id="org863ff58"&gt;Introduction&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-org863ff58"&gt;
&lt;p&gt;
&lt;a href="https://en.wikipedia.org/wiki/JavaBeans"&gt;JavaBeans&lt;/a&gt; have been around since forever in the Java world. They're well
supported, but not well designed, as you can see from the list of
disadvantages on Wikipedia. Unfortunately, we're stuck with them. Frameworks
like &lt;a href="https://spark.apache.org/"&gt;Apache Spark&lt;/a&gt; and others give us nice things in return for those beans.
To create a library which is usable from Java and Scala and is compatible with
Spark we must be able to create proper JavaBeans (perhaps - we'll get back to
this later).
&lt;/p&gt;

&lt;p&gt;
Hence, we may need to follow the JavaBean standard.
The requirements are simple.
&lt;/p&gt;

&lt;ul class="org-ul"&gt;
&lt;li&gt;Getters and setters, also known as accessors, for all fields.
Those are the typically verbose
Java methods like &lt;code&gt;long getSomeLongField()&lt;/code&gt; and
&lt;code&gt;void setSomeLongField(long value)&lt;/code&gt; which we perhaps wanted to escape from
by moving to Clojure. The setters imply  that our instance
fields must be mutable.&lt;/li&gt;
&lt;li&gt;JavaBean classes need to implement &lt;code&gt;java.io.Serializable&lt;/code&gt; but this is
usually easy. We just need to specify that we implement this
interface and make sure the types of our fields also implement
the interface.&lt;/li&gt;
&lt;li&gt;A nullary constructor, ie. a constructor that takes zero arguments.
This one is a bit difficult for Clojure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-orge8a5b93" class="outline-2"&gt;
&lt;h2 id="orge8a5b93"&gt;JavaBeans through deftype&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-orge8a5b93"&gt;
&lt;p&gt;
There are &lt;a href="https://clojure.org/reference/java_interop#_implementing_interfaces_and_extending_classes"&gt;several&lt;/a&gt; &lt;a href="https://clojure.org/reference/datatypes"&gt;ways&lt;/a&gt; to create Java classes from Clojure.
This &lt;a href="https://stackoverflow.com/questions/7142495/why-does-clojure-have-5-ways-to-define-a-class-instead-of-just-one"&gt;can be confusing&lt;/a&gt;, but in our case we can directly rule out
&lt;code&gt;defrecord&lt;/code&gt; since it only supports immutable fields.
&lt;/p&gt;

&lt;p&gt;
Our options are &lt;code&gt;deftype&lt;/code&gt; and &lt;code&gt;gen-class&lt;/code&gt;. We'll start with &lt;code&gt;deftype&lt;/code&gt;
since it's easier to work with. Mutable fields can be
created by specifying the &lt;code&gt;:volatile-mutable true&lt;/code&gt; metadata, and we need to use
&lt;code&gt;definterface&lt;/code&gt; to specify our accessor methods.
&lt;/p&gt;

&lt;p&gt;
We can use &lt;code&gt;deftype&lt;/code&gt; to implement a JavaBean for edits to Wikipedia
(we'll have more to say about this in the future).
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;definterface &lt;/span&gt;&lt;span class="nv"&gt;IWikiEdit&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;Long&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;getTimestamp&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;setTimestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;Long&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;getTitle&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;setTitle&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;title&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nb"&gt;long &lt;/span&gt;&lt;span class="nv"&gt;getByteDiff&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;setByteDiff&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="nv"&gt;Long&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;byte-diff&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;deftype &lt;/span&gt;&lt;span class="nv"&gt;DeftypeEditEvent&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:volatile-mutable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="ss"&gt;:tag&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;java.lang.Long&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;timestamp&lt;/span&gt;
&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:volatile-mutable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="ss"&gt;:tag&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;java.lang.String&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;title&lt;/span&gt;
&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:volatile-mutable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="ss"&gt;:tag&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;java.lang.Long&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;byteDiff&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;java.lang.Object&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;"DeftypeEditEvent; title="&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;title&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;", byteDiff="&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;byteDiff&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;IEditEvent&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;getTimestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;setTimestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;timestamp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;v&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;getTitle&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;setTitle&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;title&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;v&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;getByteDiff&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;byteDiff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;setByteDiff&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;byteDiff&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;v&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;java.io.Serializable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;
We can then create &lt;code&gt;DeftypeEditEvent&lt;/code&gt; instances by using the
factory method or by calling the associated Java class constructor
directly.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;ns &lt;/span&gt;&lt;span class="nv"&gt;bean.deftype-bean-test&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:require&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;bean.deftype-bean&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;:as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;"This is a bean:"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;dt/-&amp;gt;DeftypeEditEvent&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1483138282&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"hi"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;"This too:"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;bean.deftype_bean.DeftypeEditEvent.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1483138282&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"hi"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;
Unfortunately this is not a true JavaBean, because it doesn't have a nullary
constructor. &lt;code&gt;deftype&lt;/code&gt; only creates a single constructor which takes as many
arguments as there are fields. Can we create real JavaBeans in Clojure?
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-orgadea975" class="outline-2"&gt;
&lt;h2 id="orgadea975"&gt;gen-class to the rescue&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-orgadea975"&gt;
&lt;p&gt;
&lt;code&gt;gen-class&lt;/code&gt; supports a lot of features, including nullary constructors.
Due to its complexity &lt;code&gt;gen-class&lt;/code&gt; is generally not recommended, but
if we want nullary constructors for our JavaBeans
it's the only way - at least if we want to stick to pure Clojure.
We'll take care to avoid reflection by using type hints.
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://wjoel.com/posts/clojure-java-beans.html"&gt;Read more…&lt;/a&gt; (5 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description><category>benchmarks</category><category>clojure</category><category>java</category><category>macros</category><guid>https://wjoel.com/posts/clojure-java-beans.html</guid><pubDate>Wed, 08 Feb 2017 20:42:00 GMT</pubDate></item></channel></rss>