Feed aggregator

Spring Data Release Train Ingalls M1 Released

Javalobby Syndicated Feed - Fri, 29-Jul-16 22:31

On behalf of the Spring Data team, I’m happy to announce the first milestone of the Ingalls release train. The release ships with 230 tickets fixed! The most noteworthy new features are:

  • Use of method handles for property access in conversion subsystem (Commons, MongoDB).
  • Upgrade to Cassandra 3.0 for Spring Data Cassandra.
  • Support for declarative query methods for Cassandra repositories.
  • Support for Redis geo commands.
  • Any-match mode for query-by-example.
  • Support for XML- and JSON-based projections for REST payloads (see the example for details).

Find a curated change log in our release train wiki or skim through a full list of changes in JIRA.

Categories: Java

5 Up-and-Coming Programming Languages to Know About

Javalobby Syndicated Feed - Fri, 29-Jul-16 03:31

Staying current in the programming field can sometimes make you feel like the Red Queen in “Alice Through the Looking-Glass.” She said, “It takes all the running you can do to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”

You’re a master at Ruby on Rails? Great, but how are you with statistical analysis in R? Want to work at Google? Forget Python and start building channels in Go.

Categories: Java

This Week in Spring: Spring Cloud M1 Available

Javalobby Syndicated Feed - Fri, 29-Jul-16 03:31

Welcome to another installment of This Week in Spring! This week I’m mostly in San Francisco and Saint Louis, busily preparing for the big event next week!

This is my favorite time of year! As we lead up to SpringOne Platform, there’s so much good stuff being released that one can hardly keep up! I am really looking forward to this year’s SpringOne Platform show, coming in early August. It’s an amazing time to build applications, and SpringOne Platform is in a unique position to capture the larger discussion: why do we #devops, #cloud, #agile, or #microservice? Join the discussion August 1-4, 2016 in beautiful Las Vegas and let’s find out!

Categories: Java

ActiveRecord Is Even Worse Than ORM

Javalobby Syndicated Feed - Fri, 29-Jul-16 00:38

You probably remember what I think about ORM, a very popular design pattern. In a nutshell, it encourages us to turn objects into DTOs, which are anemic, passive, and not objects at all. The consequences are usually dramatic — the entire programming paradigm shifts from object-oriented to procedural. I tried to explain this at JPoint and JEEConf this year. After each talk, a few people told me that what I'm suggesting is called the ActiveRecord or Repository pattern.

Moreover, they claimed that ActiveRecord actually solves the problem I've found in ORM. They said I should explain in my talks that what I'm offering (SQL-speaking objects) already exists and has a name: ActiveRecord.

Categories: Java

Anticipated Impact of Java 9

Javalobby Syndicated Feed - Thu, 28-Jul-16 23:01

To gather insights on the state of the Java ecosystem today for DZone's Java Ecosystem research guide to be published in September, we spoke with 15 executives who are familiar with the Java ecosystem.

Here’s who we talked to:

Categories: Java

Two Kinds of Simplicity

Javalobby Syndicated Feed - Thu, 28-Jul-16 22:31

I was reading Niklaus Wirth's On the Design of Programming Languages and was struck by his discussion of simplicity. It appeared to me to apply to a number of concepts in architecture and design beyond just programming languages, and even to explain why we so often disagree about design choices.

One of the key insights of his paper is that there are multiple kinds of simplicity. He gives the example of generalizing all of the different data types into untyped values. He points out that this simplicity through generality makes it easier to write compilers, but shifts the onus to the programmer to make sure that type-less values are used correctly. So it's really a tradeoff of one kind of simplicity for another.

Categories: Java

Simple Form Login Page With Apache Sling

Javalobby Syndicated Feed - Thu, 28-Jul-16 05:31

In this post we will show a simple example of how to create a login page with Apache Sling and define which paths will require the user to log in to the site using our login page.

In order to execute this, we need to have the Sling Launchpad running on our machine. For this we can download the binary from the Downloads section of the Apache Sling website. If you have Docker installed on your machine, you can use the Sling Docker image.

Categories: Java

Signing Java Code With Certum Open Source Certificates

Javalobby Syndicated Feed - Thu, 28-Jul-16 00:31

Recent versions of Java have tightened the security around code that is run from the web, which is good news for end users but a pain for developers. These new requirements forced users of a new Java Web Start project that I have worked on to manually add exceptions for the location of the JAR files being downloaded. This sounds like a simple thing to do, but it turned out to be quite cumbersome, so I decided to get a real certificate.

Certum offers reasonably priced certificates for those working on open source projects. This is a great service for those who can’t justify hundreds of dollars a year on a certificate for code that they give away, and I bought one of these certificates for my own project.

Categories: Java

Queries and Aggregations With Scala: Part 2

Javalobby Syndicated Feed - Thu, 28-Jul-16 00:01

As we saw in part 1, it took 2+ minutes, on my laptop, to run an aggregation across 6+ million crime cases.

However, since Hazelcast is a cluster library, it only makes sense to take advantage of that. Let’s run this on a 3-node cluster. That will require a few code changes to CrimeNode:
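The changes themselves are not included in this feed excerpt. As a rough idea only (not the author's actual code; the member hostnames and the CrimeNode structure are assumptions), clustering a Hazelcast node over TCP/IP typically looks something like this in Scala:

import com.hazelcast.config.Config
import com.hazelcast.core.Hazelcast

object CrimeNode extends App {
  // Join the cluster over TCP/IP instead of multicast; the member list is an assumption.
  val config = new Config()
  val join = config.getNetworkConfig.getJoin
  join.getMulticastConfig.setEnabled(false)
  join.getTcpIpConfig
    .setEnabled(true)
    .addMember("node1.example.com")
    .addMember("node2.example.com")
    .addMember("node3.example.com")

  // Starting this application on each of the three machines forms the cluster.
  val hazelcast = Hazelcast.newHazelcastInstance(config)
}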

Categories: Java

Spark 2.0 – Datasets and case classes

codecentric Blog - Wed, 27-Jul-16 23:26

The brand new major 2.0 release of Apache Spark came out two days ago. One of its features is the unification of the DataFrame and Dataset APIs. While the DataFrame API has been part of Spark since the advent of Spark SQL (DataFrames replaced SchemaRDDs), the Dataset API was included as a preview in version 1.6 and aims at overcoming some of the shortcomings of DataFrames with regard to type safety.

This post has five sections:

  • The problem (roughly): States the problem in a rough fashion.
  • DataFrames versus Datasets: A quick recall of DataFrames and Datasets.
  • The problem (detailed): A detailed statement of the problem.
  • The solution: Proposes a solution to the problem.
  • Summary: Concludes the post.

The problem (roughly)

The question this blog post addresses is, roughly (for details see below): given a Dataset, how can one append a column containing values derived from its existing columns, without passing strings as arguments or doing anything else that would spoil the type safety the Dataset API can provide?

DataFrames versus Datasets

DataFrames have their origin in R and Python (Pandas), where they have proven to provide a concise and practical programming interface for working with tabular data with a fixed schema. Due to the popularity of R and Python among data scientists, the DataFrame concept was already familiar in those circles, which certainly helped Spark gain more users coming from that side. But the advantages of DataFrames do not only exist on the API side: there are also significant performance improvements over plain RDDs, because the additional structure information can be used by Spark SQL and Spark’s own Catalyst Optimizer.

Within the DataFrame API, a tabular data set used to be described as an RDD consisting of rows, with a row being an instance of type Array[Any]. Thus DataFrames basically do not take the data types of the column values into account. In contrast to this, the new Dataset API allows modelling rows of tabular data using Scala’s case classes.

While DataFrames are more dynamic in their typing, Datasets combine some of the benefits of Scala’s type checking with those of DataFrames. This can help to spot errors at an early stage, but certain operations on Datasets (see the next section for an example) still rely on passing column names in as String arguments rather than working with the fields of an object.

This raises the question of whether some of these operations can also be expressed within the type-safe parts of the Dataset API alone, thus keeping the newly gained benefits of the type system. As we will see in a particular example, this requires some discipline and working with traits to circumvent a problem with inheritance that arises with case classes.

The problem (detailed)

The first lines of our exemplary CSV file bodies.csv look as follows:

id,width,height,depth,material,color
1,1.0,1.0,1.0,wood,brown
2,2.0,2.0,2.0,glass,green
3,3.0,3.0,3.0,metal,blue

Reading CSV files like this becomes much easier beginning with Spark 2.0. A SparkSession provides a fluent API for reading and writing. We can do as follows:

val df: DataFrame = spark.read
                         .schema(schema)
                         .option("header", true)
                         .csv("/path/to/bodies.csv")

Spark is able to infer the schema automatically in most cases by making two passes over the input file. In our case, however, it would read all columns as type String. To avoid that, we programmatically declare the schema as follows, before the above code:

import org.apache.spark.sql.types.{DataTypes, StructField, StructType}
 
val id       = StructField("id",       DataTypes.IntegerType)
val width    = StructField("width",    DataTypes.DoubleType)
val height   = StructField("height",   DataTypes.DoubleType)
val depth    = StructField("depth",    DataTypes.DoubleType)
val material = StructField("material", DataTypes.StringType)
val color    = StructField("color",    DataTypes.StringType)
 
val fields = Array(id, width, height, depth, material, color)
val schema = StructType(fields)

DataFrames outperform plain RDDs across all languages supported by Spark and provide a comfortable API when it comes to working with structured data and relational algebra. But they offer only weak support when it comes to types, for mainly two reasons:

  1. For one thing, many operations on DataFrames involve passing in a String, either as a column name or as a query. This is error-prone: for example, df.select("colour") would pass at compile time and would only blow up a (likely long-running) job at run time.
  2. A DataFrame is basically an RDD[Row], where a Row is just an Array[Any] (see the sketch below).
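To make the second point concrete, here is a minimal sketch (using the df defined above; this snippet is not from the original post): extracting values from plain Rows forces us back to string column names and casts that the compiler cannot check.

// Both the column names and the casts to Double are only verified at run time;
// a typo like "widht" or a wrong type fails the job, not the compilation.
val volumes = df.rdd.map { row =>
  row.getAs[Double]("width") * row.getAs[Double]("height") * row.getAs[Double]("depth")
}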

Spark 2.0 introduces Datasets to better address these points. The take-away message is that instead of using type-agnostic Rows, one can use Scala’s case classes or tuples to describe the contents of the rows. The (not so) magic gluing is done by calling as on a DataFrame. (Tuples would match by position and also lack the possibility of customizing the naming.)

final case class Body(id: Int, 
                      width: Double, 
                      height: Double, 
                      depth: Double, 
                      material: String, 
                      color: String)
 
import spark.implicits._ // provides the Encoder[Body] required by as[Body]
val ds = df.as[Body]

The matching between the DataFrame’s columns and the fields of the case class is done by name, and the types have to match. In summary, this introduces a contract and narrows down possible sources of error. For example, one immediate benefit is that we can access fields via the dot operator and get additional IDE support:

val colors = ds.map(_.color) // Compiles!
ds.map(_.colour)             // Typo - WON'T compile!

Further, we can use this feature and the newly added type-safe aggregation functions to write queries with compile time safety:

import org.apache.spark.sql.expressions.scalalang.typed.{
  count => typedCount, 
  sum => typedSum}
 
ds.groupByKey(body => body.color)
  .agg(typedCount[Body](_.id).name("count(id)"),
       typedSum[Body](_.width).name("sum(width)"),
       typedSum[Body](_.height).name("sum(height)"),
       typedSum[Body](_.depth).name("sum(depth)"))
  .withColumnRenamed("value", "group")
  .alias("Summary by color level")
  .show()

If we wanted to compute the volume of all bodies, this would be quite straightforward in the DataFrame API. Two solutions come to mind:

// 1. Solution: Using a user-defined function and appending the results as a column
import org.apache.spark.sql.functions.udf
 
val volumeUDF = udf {
  (width: Double, height: Double, depth: Double) => width * height * depth
}
 
ds.withColumn("volume", volumeUDF($"width", $"height", $"depth"))
 
// 2. Solution: Using a SQL query (the Dataset has to be registered as a temporary view first)
ds.createOrReplaceTempView("bodies")
 
spark.sql("""
           |SELECT *, width * height * depth
           |AS volume
           |FROM bodies
           |""".stripMargin)

But this would throw us back to working with strings again. What would a solution with case classes look like? Of course, more work might be involved here, but keeping type support could be a rewarding benefit in crucial operations.

While case classes are convenient in many regards, they do not support inheritance (see the links at the end of this post). So we cannot declare a case class BodyWithVolume that extends Body with an additional volume field. Assuming we had such a class, we could do this:

ds.map { body =>
  val volume = body.width * body.height * body.depth
  BodyWithVolume(body.id, body.width, body.height, body.depth, body.material, body.color, volume)
}

This would of course solve our problem of adding the volume as a new field and mapping a Dataset onto a new Dataset, but as said, case classes do not support inheritance. Of course, no one could prevent us from declaring the classes Body and BodyWithVolume independently, without the latter extending the former. But this certainly feels awkward given their close relationship.

The solution

Are we out of luck? Not quite. We can derive both classes from some common traits:

trait IsIdentifiable {
  def id: Int
}
 
trait HasThreeDimensions {
  def width: Double
  def height: Double
  def depth: Double
}
 
trait ConsistsOfMaterial {
  def material: String
  def color: String
}
 
trait HasVolume extends HasThreeDimensions {
  def volume = width * height * depth
}
 
final case class Body(id: Int, 
                      width: Double, 
                      height: Double, 
                      depth: Double, 
                      material: String, 
                      color: String) extends 
                      IsIdentifiable with 
                      HasThreeDimensions with 
                      ConsistsOfMaterial
 
final case class BodyWithVolume(id: Int, 
                                width: Double, 
                                height: Double, 
                                depth: Double, 
                                material: String, 
                                color: String) extends 
                                IsIdentifiable with 
                                HasVolume with 
                                ConsistsOfMaterial

This indisputably introduces more code. Still, if one takes into account that at a later stage there might be a need to compute densities for bodies, and so on, this might be a good foundation, especially if type-safe queries are a concern.
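As a small illustration (a sketch based on the definitions above, not code from the original post), the derived volume can now be added and aggregated without a single string-based column reference; typedSum is the alias imported in the aggregation example earlier:

import org.apache.spark.sql.Dataset

// Map onto the richer type; volume itself is derived by the HasVolume trait.
val withVolume: Dataset[BodyWithVolume] = ds.map { body =>
  BodyWithVolume(body.id, body.width, body.height, body.depth, body.material, body.color)
}

// The derived value participates in type-safe aggregations, no string column names needed.
withVolume.groupByKey(_.color)
          .agg(typedSum[BodyWithVolume](_.volume).name("sum(volume)"))
          .show()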

Another limitation one certainly has to face when working with Datasets is that currently a case class can only have 22 parameters, making it hard to work with, say, CSV files having 23 columns. The same holds for tuples instead of case classes.

Inheritance vs. composition

Of course, another solution could consist in using composition instead of inheritance. That is, we could keep Body as above and additionally declare final case class BodyWithVolume(body: Body, volume: Double). This would nest things, as sketched below.
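A minimal sketch of this composition variant (again an assumption, not code from the original post):

final case class BodyWithVolume(body: Body, volume: Double)

val withVolume = ds.map(body => BodyWithVolume(body, body.width * body.height * body.depth))

// The original fields are now reached through the wrapper, e.g. withVolume.map(_.body.color),
// while the derived value is a plain field: withVolume.map(_.volume).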

Summary

In this blog post, we took a quick glimpse at the new Dataset API, saw how one can create a Dataset from a CSV file, and then performed basic operations on it using the dot operator and transformations like map. We also saw how the Dataset API allows us to write type-safe aggregations. Finally, we discussed how traits can help to model relations between case classes, which in turn can be used to add new derived columns to a given Dataset.

Links:

  1. Official Spark 2.0 release notes, which also list the added features
  2. Spark’s Programming guide: Datasets and DataFrames
  3. Databricks blog: Introducing Apache Spark Datasets
  4. Databricks blog: A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets
  5. Stackoverflow: What is so wrong with case class inheritance?

The post Spark 2.0 – Datasets and case classes appeared first on codecentric Blog.

Categories: Agile, Java, TDD & BDD

How the Java Community Process Has Changed and Where It's Headed

Javalobby Syndicated Feed - Wed, 27-Jul-16 23:01

To gather insights on the state of the Java ecosystem today for DZone's Java Ecosystem research guide to be published in September, we spoke with 15 executives who are familiar with the Java ecosystem.

Here’s who we talked to:

Categories: Java

Java Code Challenge: Chemical Symbol Naming-Part One

Javalobby Syndicated Feed - Wed, 27-Jul-16 22:31

Java Code Challenge is a new regular segment taking the best challenge from Reddit's dailyprogrammer. Things are a little different here as we're focused on Java. A working solution is not enough; we're looking for the cleanest Java code with tests. Third-party libraries are welcome, but if you can do without them, it will be easier for others to comprehend.

If you can fit your solution in the comments then go for it, but preferably put your answer on GitHub and link it in the comments. Next week we'll share the best solutions and highlight the best code practices we see!

Categories: Java

Pattern Implementations

Javalobby Syndicated Feed - Wed, 27-Jul-16 20:31

Goals

The “implementation” or “examples” section of any pattern discussion holds several goals (and a few “anti-goals”):

  • Be able to put a concrete-ish example in front of people seeking such. It’s hard to understand exactly how the pattern is supposed to work without pictures or code. I am not great with graphical tools, so for me it’s easier to use code to provide that demonstration.
  • NOT to expect “ready-made code” for reuse. Patterns are not drop-in building blocks that can save you time and energy when doing your own implementation. These examples are here to demonstrate a few techniques around implementation, but attempts to re-use the code directly will probably always meet with failure at some level.
  • Demonstrate how to do certain things idiomatically within a particular language. Each language brings with it a particular idiomatic style or feature set that may shape how one might use a particular pattern. I am not an expert with all of these languages below, but part of this exercise (for me) is to learn (and document) how to exercise the idioms of a particular language using the pattern as a scaffold upon which to hang it.
  • Provide a mechanism by which to concretely compare and contrast one pattern against another. Is a Strategy really all that close to a Command? Having concrete examples of each allows for a certain amount of comparison-and-contrast, and hopefully sparks some good discussion around when to use each.
  • NOT to suggest that one language is “better” than another. Any such qualitative judgment around one language over another is entirely in the eyes of the beholder; no such judgement is intended from me, and any attempt to use this exercise as a means to judge one language more harshly than another will quickly earn this author’s scorn. Different languages chose to do things in different ways for very good reasons; if you cannot explain the reasons, then you have no business offering up the judgement.

Languages

In general, there’s a long list of languages I will use to define some example implementations of the patterns in the catalog. Note that while this isn’t an “ordered” list, meaning I will probably do implementations in a seemingly-random order, the hope is that when this is all said and done, the list of pattern implementations will range across the following:

Categories: Java

Gradle Goodness: Set VCS for IntelliJ IDEA in Build File [Snippet]

Javalobby Syndicated Feed - Wed, 27-Jul-16 05:31

When we use the IDEA plugin in Gradle we can generate IntelliJ IDEA project files. We can customise the generated files in different ways. One of them is using a simple DSL to configure certain parts of the project file. With the DSL it is easy to set the version control system (VCS) used in our project.

In the next example build file we'll customize the generated IDEA project file and set Git as the version control system. The property is still incubating, but we can use it to have a proper configuration.

Categories: Java

What’s New in Intellij Idea 2016.2 for Spring Developers

Javalobby Syndicated Feed - Wed, 27-Jul-16 02:31

If you’re a heavy user of Spring and IntelliJ IDEA, you’ll find the IntelliJ IDEA 2016.2 update very interesting. Besides many important bugfixes, the update adds support for Spring 4.3 and Spring Security 4.0 features and cache abstraction, improves support for Spring MVC, and further tunes performance. Before reading about this in more detail, make sure you’ve registered for our webinar dedicated to the new release and Spring framework.

OK, now back to the update. First of all, the IDE now supports generic types as qualifiers and provides navigation for them:

Categories: Java

Programmer Stereotypes By Language Community

Javalobby Syndicated Feed - Wed, 27-Jul-16 02:31

In the recent article Humble Lisp Programmers, author John Cook mentions that a stereotypical Lisp programmer "does look down on everyone", but clarifies that this trait may be more accurately attributed to those who write about Lisp programming and not necessarily those who are Lisp programmers. The article got me thinking about other stereotypes held by the industry pertaining to specific language camps, and my interactions over the years with members of those camps. I spent most of my career focusing on recruiting Java talent, but for the last several years I have had searches across a wide range of skills.

Y Combinator founder Paul Graham has published several posts that at least allude to stereotypes and/or the relationships between programmers and the languages they use. Notably, Graham's Java's Cover and The Python Paradox touch on these subjects. His later piece Great Hackers asserts that language choice for a project is a "social decision" as much as a technical one, and reiterates his earlier suggestion that Python projects will attract better talent than Java projects. Perhaps Graham, who is not-so-coincidentally synonymous with Lisp, was on Cook's mind while he was musing on the stereotype.

Categories: Java

Testing Java Libraries From ScalaCheck

Javalobby Syndicated Feed - Wed, 27-Jul-16 00:31

In this article I would like to show you how to integrate ScalaCheck into a Maven project in order to test your Java classes. As an example I will use the PhoneNumber class, as seen in Item 9 of the book Effective Java. The idea is to test that the equals method implementation of that class conforms to the equals contract of the Java Specification (more on that later). We will also test a couple of Netty handlers that could be used to encode/decode a PhoneBook object in order to send it over the network.

The finished project can be found on this GitHub repo: https://github.com/videlalvaro/phone-guide
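The feed excerpt stops before the code, but as a rough sketch of the idea (not the author's code; the PhoneNumber constructor shape is an assumption based on the Effective Java example), an equals-contract check in ScalaCheck might look like this:

import org.scalacheck.{Gen, Properties}
import org.scalacheck.Prop.forAll

object PhoneNumberEqualsSpec extends Properties("PhoneNumber.equals") {

  // Hypothetical generator; the real PhoneNumber constructor may differ.
  val phoneNumbers: Gen[PhoneNumber] = for {
    area   <- Gen.choose(0, 999)
    prefix <- Gen.choose(0, 999)
    line   <- Gen.choose(0, 9999)
  } yield new PhoneNumber(area, prefix, line)

  // Reflexivity: every instance must equal itself.
  property("reflexive") = forAll(phoneNumbers) { p =>
    p.equals(p)
  }

  // Symmetry: a.equals(b) must agree with b.equals(a).
  property("symmetric") = forAll(phoneNumbers, phoneNumbers) { (a, b) =>
    a.equals(b) == b.equals(a)
  }

  // Consistency with hashCode: equal objects must share a hash code.
  property("equal objects have equal hash codes") = forAll(phoneNumbers, phoneNumbers) { (a, b) =>
    !a.equals(b) || a.hashCode == b.hashCode
  }
}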

Categories: Java

Java 8's Impact to Date

Javalobby Syndicated Feed - Tue, 26-Jul-16 23:01

To gather insights on the state of the Java ecosystem today for DZone's Java Ecosystem research guide to be published in September, we spoke with 15 executives who are familiar with the Java ecosystem.

Here’s who we talked to:

Categories: Java

When Things May Get Out of Control: Circuit Breakers in Practice

Javalobby Syndicated Feed - Tue, 26-Jul-16 22:31

In the previous post we started the discussion about circuit breakers and why this pattern has gained so much importance these days. We learned about Netflix Hystrix, the most advanced circuit breaker implementation for the JVM platform, and its typical integration scenarios. In this post we are going to continue exploring the other available options, starting with the Apache Zest library.

Surprisingly, Apache Zest, while certainly a gem, is not well known or widely used. It is a framework for domain-centric application development which aims to explore the composite-oriented programming paradigm. Its roots go back to 2007, when it was born under another name, Qi4j (it became Apache Zest in 2015). It would take a complete book just to go through Apache Zest's features and concepts, but what interests us here is the fact that Apache Zest ships a simple circuit breaker implementation.

Categories: Java

StackOverflow: Seven of the Best Java Answers That You Haven’t Seen

Javalobby Syndicated Feed - Tue, 26-Jul-16 21:16

This post is by Henn Idan on the Takipi blog.

StackOverflow is a gold mine for developers. It helps us find the most useful answers to specific issues we encounter, and we always find ourselves learning new things from it.

Categories: Java
