Monthly Archives: November 2013

The Machine Learning Journey: An introduction to linear regression – Cost Function (ML for the Layman)

I have just stumbled upon a very good introduction to the cost function in Machine Learning.

The Machine Learning Journey: An introduction to linear regression – Cost Function (ML for the Layman).

Tutorials and Experiments (first round)

I have just posted my first tutorial on linear regression, and I will soon be working on one on multiple regression. Although there is considerable overlap between Statistics and Machine Learning (I would add Data Mining, too), I would like to keep “connecting the dots” between the two disciplines. I will be preparing a tutorial called “Machine Learning in 5 minutes” where I will discuss supervised learning methods for classification and regression. Stay tuned!

Transforming Solr responses using XSLT

I have recently been working with a SolrJ-based Java client that accesses external Solr instances, and I have come across an issue with multivalued Solr fields. Multivalued fields can take more than one value, so they effectively store an array of values. Put simply, a multivalued field can be seen as a regular field with multiple values concatenated together.

In my case, the Solr schema defined the field title with the following configuration:

[code language=”css”]
<field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
[/code]

Consequently, the XML response from Solr displayed square brackets at the beginning and at the end of each title:

[code language=”css”]
<title>[some title]</title>
[/code]
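One plausible origin of the brackets, offered here only as an assumption, is that somewhere in the pipeline the list of values is turned into text with Java's default collection `toString()`. The small sketch below illustrates that behaviour; the class `BracketOrigin` and its `render` helper are hypothetical and not part of Solr or SolrJ.

```java
import java.util.Arrays;
import java.util.List;

class BracketOrigin {
    // Hypothetical helper: renders a multivalued field the way a naive
    // serializer might, by relying on the list's default toString().
    public static String render(List<String> values) {
        return "<title>" + values + "</title>";
    }

    public static void main(String[] args) {
        // A single-valued list still gets wrapped in square brackets.
        System.out.println(render(Arrays.asList("some title")));
        // Several values are joined with ", " inside the same brackets.
        System.out.println(render(Arrays.asList("first title", "second title")));
    }
}
```

If this is indeed what happens, a field holding several values would show up as one bracketed, comma-separated string rather than as separate elements.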

The initial code for transforming the Solr response extracted the value of the field title by using the element

[code language=”css”]<xsl:value-of>[/code]

with its attribute

[code language=”css”]select[/code]

The XPath expression

[code language=”css”]field[@name='title'][/code]

selects the field element whose name attribute equals 'title'. Its string value is copied to the output unchanged, so the square brackets around array values are retained:

[code language=”css”]

<title>
  <xsl:variable name="title">
    <xsl:choose>
      <xsl:when test="string-length(field[@name='title']) &gt; 0">
        <xsl:value-of select="field[@name='title']" />
      </xsl:when>
      <!-- fall back to the text field when the title is empty -->
      <xsl:otherwise>
        <xsl:value-of select="field[@name='text']" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <!-- handle empty title -->
  <xsl:choose>
    <xsl:when test="string-length($title) = 0">Untitled</xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$title"/>
    </xsl:otherwise>
  </xsl:choose>
</title>
[/code]
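To see why the brackets survive, it can help to evaluate the XPath expression on its own. The sketch below uses the JDK's `javax.xml.xpath` API; the `<doc>` wrapper element and the class name `XPathStringValueDemo` are assumptions made for the demo, not the actual shape of the Solr response.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathStringValueDemo {
    // Evaluates an XPath 1.0 expression against an XML string and
    // returns the result converted to a string.
    public static String evaluate(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed input layout: a <doc> element holding named <field> children.
        String xml = "<doc><field name=\"title\">[some title]</field></doc>";
        // The predicate picks the element by its name attribute; the
        // element's text, brackets included, is what comes back.
        System.out.println(evaluate("/doc/field[@name='title']", xml));
    }
}
```

The predicate only decides which element is selected. It does nothing to the element's text, which is why the brackets have to be removed in a separate step.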

I came up with the following solution, which removes the square brackets by using the XPath string functions substring-before and substring-after. Together they return the value of the field title without the left and right brackets:

[code language=”css”]
<title>
  <xsl:variable name="title">
    <xsl:choose>
      <xsl:when test="string-length(field[@name='title']) &gt; 0">
        <!-- keep only the text between the first '[' and the following ']' -->
        <xsl:value-of select="substring-before(substring-after(field[@name='title'],'['),']')" />
      </xsl:when>
      <!-- fall back to the text field when the title is empty -->
      <xsl:otherwise>
        <xsl:value-of select="field[@name='text']" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <!-- handle empty title -->
  <xsl:choose>
    <xsl:when test="string-length($title) = 0">Untitled</xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$title"/>
    </xsl:otherwise>
  </xsl:choose>
</title>
[/code]
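The bracket-stripping expression can be tried outside Solr with the XSLT processor that ships with the JDK. This is only a minimal sketch: the stylesheet is reduced to the single expression, and the `<doc>` input layout, the class name `StripBracketsDemo`, and the `transform` helper are assumptions introduced for the demo.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripBracketsDemo {
    // Reduced stylesheet built around the expression from the post.
    // The /doc/field input layout is assumed for illustration only.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<title><xsl:value-of select=\""
      + "substring-before(substring-after(field[@name='title'],'['),']')"
      + "\"/></title>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the output.
    public static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            transform("<doc><field name=\"title\">[some title]</field></doc>"));
        System.out.println(
            transform("<doc><field name=\"title\">[first, second]</field></doc>"));
    }
}
```

Two properties of the expression are worth keeping in mind. A value with several entries keeps its comma-separated form, because only the outer brackets are removed. And since substring-after returns an empty string when its argument contains no '[', a title that arrives without brackets comes out empty, so this approach relies on the brackets always being present.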