Feed aggregator

jhsdb: A New Tool for JDK 9

Javalobby Syndicated Feed - Tue, 06-Jun-17 13:01

I like to use the command-line tools provided with the JDK in the early steps of analyzing performance and other issues with Java-based applications, and I have blogged on tools such as jcmd, jps, jstat, jinfo, jhat and jmap, jrunscript, jstack, and jdeps. JDK 9 brings new command-line tools, several of them tied to new JDK 9 features such as modularity (jlink and jmod) and enhanced deprecation (jdeprscan). In this post, I focus on a new command-line tool delivered with JDK 9 for dealing with performance and serviceability issues: jhsdb.

Its Oracle JDK 9 Documentation Early Access page describes the jhsdb tool this way: "You use the jhsdb tool to attach to a Java process or to launch a postmortem debugger to analyze the content of a core-dump from a crashed Java Virtual Machine (JVM)." The tool comes with several "modes," and several of these modes correspond in name and function to individual command-line tools available in previous JDK distributions. The jhsdb tool not only provides a single tool that encompasses the functionality of multiple other tools, but it also provides a single, consistent approach to applying these different functions. For example, the jhsdb command-line syntax for getting help for each of the "modes" is identical.

Categories: Java

The Problems Facing the JPMS's Adoption

Javalobby Syndicated Feed - Tue, 06-Jun-17 09:01

The EC's "No" vote on the Java Platform Module System took plenty of people by surprise last month. Concerns with the JPMS, as proposed, ranged from its implementation to fears over the perceived impact it would have on the community.

Among plenty of onlookers, Reza Rahman of the Java EE Guardians has been keeping an eye on the proceedings as they've developed, both before and after the vote.

Categories: Java

What's Wrong With Hashcode in java.lang.String?

Javalobby Syndicated Feed - Tue, 06-Jun-17 03:01

One of the most significant criteria for every hash function is its tendency toward collisions, and the hash functions inside the JDK are no exception. The main idea of a collision attack is finding two different messages, m1 and m2, such that hash(m1) = hash(m2). In this article, I would like to show that this problem is reproducible in every program written in Java and how to get around it.

First, let's consider the internals of the hashCode method in java.lang.String:
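A minimal sketch of that computation, following the polynomial given in the String.hashCode() Javadoc rather than copying the JDK source. The class name StringHashSketch is made up for illustration, and "Aa"/"BB" is a commonly cited colliding pair, not necessarily the one the article uses:

```java
class StringHashSketch {
    // Illustrative re-implementation of the documented formula:
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], using int arithmetic.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // The sketch agrees with the built-in method.
        System.out.println(hash("Aa") == "Aa".hashCode()); // true
        // Two different strings, same hash: 65*31 + 97 == 66*31 + 66.
        System.out.println(hash("Aa") + " " + hash("BB")); // 2112 2112
        // Concatenating colliding blocks yields further collisions.
        System.out.println(hash("AaAa") == hash("BBBB"));  // true
    }
}
```

Because equal-length colliding blocks can be concatenated freely, one colliding pair is enough to build arbitrarily many colliding strings.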

Categories: Java

Deploying to Tomcat from Octopus Deploy

Javalobby Syndicated Feed - Tue, 06-Jun-17 00:01

Octopus Deploy has a large collection of useful steps (both included and community provided) that can be used to deploy packages to a variety of different destinations and via different methods.

Fortunately, these same deployment steps can be used to deploy Java packages to Java web servers running in Linux out of the box.

Categories: Java

Kofax Transformation Modules (KTM), AI and Machine Learning

codecentric Blog - Mon, 05-Jun-17 22:30

The topics AI, machine learning and deep learning are on everyone’s lips, and the media regularly publishes articles on them. What many do not know is that Kofax Transformation Modules (KTM) also provides machine learning mechanisms. KTM is a system for automatic classification of documents and extraction of data fields (see also: Document classification with Kofax Transformation Modules).

KTM has always included machine learning tools, which can be used alone or together with the rule-based free-form recognition. These neural network-based methods will be briefly described here.

A KTM project consists of the following phases:

  • Project preparation: Document types, data fields, clustering
  • Project implementation: Classification and extraction design
  • Production: Capturing, classification, extraction, manual validation

Prior to extraction, the classification of the document has to be done because different types of documents normally have different extraction fields. Once the classification has been successfully carried out, the document type-specific field extraction can be started.

KTM provides machine learning tools for all three phases (project preparation, implementation, and production) in order to train the system and successively improve the quality of the results.

Through training, learning systems recognize the context and store it for future use. KTM does not memorize the absolute position of a field, but saves the environment in which the field is located. This can include words that are located nearby (and their distances to the field), the position relative to other fields, and also lines or similar objects. This newly learned context is immediately available when the next document is processed, and the field value can, hopefully, be extracted directly from a similar document. “Hopefully” was inserted because such systems are not deterministic and some document types must be trained several times.

The KTM toolbox for machine learning consists of the following elements:

  • Clustering Tool: Get basic information about the document types. Which are the main/important types?
  • Administrative training with examples of the main document types: Document type classification
  • Administrative training with examples of the main document types: Extraction of the field data
  • Production cycle: System learns by manual assignment of the document type
  • Production cycle: System learns by manual field correction/data entry

The Clustering Tool

At the beginning of a recognition project, it should first be clarified which documents promise the most “profit”. Which document types are worthwhile for training – and which should not be considered initially?

KTM includes a tool (clustering tool) that analyzes unsorted batches of documents and divides them into batches with similar characteristics. This sorting can be done according to graphical criteria as well as according to content. After using this tool, you usually have a very good impression of which document types are the main ones in a project and should be trained first.

This example shows that one should first concentrate on processing the generated batches 1, 5 and 4. Part 4 contains 36 “CAR Parts Co-Delivery Note” documents.

Administrative training of document types

For this, you will use the main document types that have been determined by the clustering tool. Within the KTM development environment, the document types are created manually, or they can be created automatically from the batches. For each document type, an administrator assigns a number of sample documents to the system for learning. This number is project-dependent, but in real projects a value of about 20 documents has proven effective. The training of the document types can take place via the layout and/or the text content of the documents.

The success of this training can be immediately checked using the non-trained examples of the document type batches.

In real-life projects, I only trust the classification result obtained by learning when a certain confidence level has been reached (e.g. 80%). At lower values, additional document-type-specific rules are used to determine the document type.
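The threshold rule can be sketched in a few lines of Java. This is only an illustration of the decision logic, not KTM's API: the class ClassificationFallback, the method decide, and the 0.80 constant are assumptions chosen to mirror the 80% example above.

```java
import java.util.function.Supplier;

class ClassificationFallback {
    // Hypothetical threshold mirroring the "e.g. 80%" from the text.
    static final double MIN_CONFIDENCE = 0.80;

    // Returns the learned document type if its confidence is high enough;
    // otherwise asks the rule-based classifier (evaluated lazily).
    static String decide(String learnedType, double confidence, Supplier<String> ruleBasedType) {
        if (confidence >= MIN_CONFIDENCE) {
            return learnedType;
        }
        return ruleBasedType.get();
    }

    public static void main(String[] args) {
        // Made-up sample values for illustration only.
        System.out.println(decide("Invoice", 0.92, () -> "Delivery Note")); // Invoice
        System.out.println(decide("Invoice", 0.55, () -> "Delivery Note")); // Delivery Note
    }
}
```

Passing the rules as a Supplier means they only run when the learned result is not trusted.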

Administrative training of field extraction

After the training of the document types, the extraction of the document type-specific data fields can be done in the next step. Similar to the training of the document types, a certain number of documents is taken per document type. Training is done by just showing the system the position on the document where the data for a field should be extracted. This is simply done by using mouse clicks. KTM does not memorize the absolute positions, but stores features (graphics, words, lines etc.) near the extraction position.

Again, the success of this training can be immediately checked using the non-trained examples of the document type batches.

Online Learning during production

After a pre-trained system has been put into production, KTM offers the possibility to further improve the classification and extraction during daily processing. This includes the optimization of the main document types already trained during preparation, but also the basic training and optimization of the other, previously neglected document types.

The KTM validation module presents for validation all documents whose classification was uncertain or whose data fields were uncertain or empty after extraction. A user can manually correct the classification and/or the data fields, and the document may be marked for online learning if desired.

After that, the original document moves on to further processing and a copy is sent to the KTM learning mechanism. Depending on the configuration of the KTM system, either the system learns the changes directly, so that they are available when the next batch is processed, or an administrator must first check and release the new learning document.

The following diagram shows the flow of KTM processing and the integration of online learning:

However, direct online learning – without the control of an administrator – entails the risk that the system will learn incorrectly, since the person at the validation workplace directly releases a document for learning. Neural networks cannot be debugged like programs in classical development – there must be other ways to find the error (the wrongly trained document) and make corrections.

KTM provides a view of all trained documents per document type as well as the possibility to remove or reconfigure documents from a learning set. Nevertheless, one should not underestimate the effort for such a correction. Therefore, the release of new learning documents should be done by an administrator or a specialist, despite the delay in getting the new training set into production.

Categories: Agile, Java, TDD & BDD

All About the Singleton

Javalobby Syndicated Feed - Mon, 05-Jun-17 21:01

The Singleton design pattern is one of the simplest design patterns: It involves only one class throughout the application that is responsible for instantiating itself to make sure it creates no more than one instance. At the same time, it provides a global point of access to that instance. The same instance can then be used from everywhere, and other code cannot invoke the constructor directly.

There are various kinds of implementations, and I am going to explain them one by one.
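As one possible starting point, here is a minimal sketch of an eagerly initialized singleton. It is only one of several variants and not necessarily the first one the article covers; the class name IdGenerator and its next() method are invented for illustration.

```java
class IdGenerator {
    // Created once when the class is initialized; class initialization
    // is thread-safe in the JVM, so no extra locking is needed here.
    private static final IdGenerator INSTANCE = new IdGenerator();

    private int counter;

    // Private constructor: code outside this class cannot call "new".
    private IdGenerator() {
    }

    // Global point of access to the single instance.
    static IdGenerator getInstance() {
        return INSTANCE;
    }

    // Example state shared by every caller of getInstance().
    synchronized int next() {
        return ++counter;
    }

    public static void main(String[] args) {
        IdGenerator a = IdGenerator.getInstance();
        IdGenerator b = IdGenerator.getInstance();
        System.out.println(a == b); // true: both names refer to one object
        a.next();
        System.out.println(b.next()); // 2: the counter is shared
    }
}
```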

Categories: Java

Running a JVM in a Container Without Getting Killed

Javalobby Syndicated Feed - Mon, 05-Jun-17 13:01

No pun intended.

JDK 8u131 includes a backport of a nice JDK 9 feature: the ability of the JVM to detect how much memory is available when running inside a Docker container.
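A quick way to see which heap limit the JVM has settled on is to print Runtime.maxMemory(). The sketch below assumes nothing beyond the standard library; the class name HeapReport is made up. To my knowledge, the 8u131 backport is switched on with -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap, but treat those flags as an assumption to check against your JDK's documentation.

```java
class HeapReport {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Run this inside a memory-limited container, with and without the
        // container-awareness flags, and compare the reported values.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
        System.out.println("Processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```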

Categories: Java

Additional Considerations of the Java Ecosystem

Javalobby Syndicated Feed - Mon, 05-Jun-17 09:01

To gather insights on the state of the Java ecosystem today, we spoke to nine executives who are familiar with the ecosystem.

We asked these experienced Java professionals "What have I failed to ask you that you think we need to consider about the Java ecosystem?" Here's what they told us (or asked us in return):

Categories: Java

Spocklight: Indicate Specification as a Pending Feature

Javalobby Syndicated Feed - Mon, 05-Jun-17 03:01

Sometimes, we find ourselves working on a new feature in our code and we want to write a specification for it without yet really implementing the feature. To indicate we know the specification will fail while we are implementing the feature, we can add the @PendingFeature annotation to our specification method. With this annotation, Spock will still execute the test, but it will set the status to ignored if the test fails. But if the test passes, the status is set to failed. So when we have finished the feature, we need to remove the annotation — and Spock will kindly remind us to do so this way.

In the following example specification, we use the @PendingFeature annotation:

Categories: Java

An Intro to AssertJ and Collections

Javalobby Syndicated Feed - Mon, 05-Jun-17 00:01

When programming in Java, you often end up writing methods returning a collection of objects. They certainly have their place in your application, but testing them can be a little tricky. Depending on the implementation of the underlying collection, the order of the elements may be different, the equals on collections is not always obvious, and so on. I have come across multiple examples of such cases in my career, and I decided to pick a couple of them and show a way to tackle them with AssertJ, the assertions library you should definitely be using.

All the recipes follow the same format. I present a short description, followed by the implementation of the test. The test is always successful, to avoid confusion. Each recipe ends with a short remark, such as which other situations it can be used for. I’m also using the Guava library to create the collections, as vanilla Java doesn’t really provide a way to do that.

Categories: Java

9 Logging Sins in Your Java Applications

Javalobby Syndicated Feed - Sun, 04-Jun-17 21:01

Logging runtime information in your Java application is critically useful for understanding the behavior of any app, especially in cases when you encounter unexpected scenarios, errors or just need to track certain application events.

In a real-world production environment, you usually don’t have the luxury of debugging. Log files can then be the only thing you have to go on when attempting to diagnose an issue that’s not easy to reproduce.

Categories: Java

Parsing in Java (Part 1): Structures, Trees, and Rules

Javalobby Syndicated Feed - Sat, 03-Jun-17 23:01

If you need to parse a language, or document, from Java, there are fundamentally three ways to solve the problem:

  • Use an existing library supporting that specific language: for example, a library to parse XML.
  • Build your own custom parser by hand.
  • Use a tool or library to generate a parser: for example, ANTLR, which you can use to build parsers for any language.

Use an Existing Library

The first option is the best for well-known and supported languages, like XML or HTML. A good library usually also includes an API to programmatically build and modify documents in that language. This is typically more than what you get from a basic parser. The problem is that such libraries are not so common and they support only the most widely used languages. In other cases, you are out of luck.
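For XML, the JDK itself ships such a library. The following is a minimal sketch using the standard javax.xml.parsers and org.w3c.dom packages; the class name XmlLibraryExample and the sample document are invented for illustration. It parses a string and then uses the same API to modify the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XmlLibraryExample {
    // Parses an XML string into a DOM tree; checked parser exceptions
    // are wrapped so callers only deal with an unchecked exception.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<books><book title=\"Dune\"/></books>");
        Element root = doc.getDocumentElement();

        // The same API that reads the document can also modify it.
        Element extra = doc.createElement("book");
        extra.setAttribute("title", "Emma");
        root.appendChild(extra);

        System.out.println(root.getElementsByTagName("book").getLength()); // 2
    }
}
```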

Categories: Java

20 Leaders' Thoughts on What Makes Great Java Devs

Javalobby Syndicated Feed - Fri, 02-Jun-17 21:01

Java remains one of the most popular programming languages. In our recent deep-dive into the hottest programming languages for 2017, Java landed second among the most-used programming languages and the languages with the most active repositories on GitHub, beaten only by JavaScript in both categories.

Aside from its widespread use, it’s also the most in-demand programming language among employers, with more job listings on Indeed (as of March 2017) seeking developers with Java skills than any other language. So naturally, if you’re one of the employers behind the 36,000+ job listings seeking Java developers, the skills and characteristics that set great Java developers apart from the pack are crucial information. And if you’re a Java programmer looking for your next gig, it’s good to know which skills you should level up and which characteristics to promote to your prospective employers.

Categories: Java

Using Java Flight Recorder Triggers [Video]

Javalobby Syndicated Feed - Fri, 02-Jun-17 13:01

A good amount has been written and said about Java Flight Recorder, its integration into the Oracle Java SE Java Virtual Machine (JVM) and the very low overhead associated with enabling the framework. It not only makes the notion of collecting detailed runtime information about a Java application in production a possibility — it makes it a reality.

Many opt to place a program in Java Flight Recorder's Continuous Recording Mode. In this state, the Java application will collect runtime data indefinitely, where you can specify (or default to) how much data you want to retain before overwriting. Once in this mode, you can at any time, with many different options, dump the runtime information into a self-contained Flight Recorder file. From there, the Java Mission Control tool can be used to open this file to further diagnose your application's behavior.

Categories: Java

This Week in Spring: Spring 5 and Spring Boot + Docker + Windows

Javalobby Syndicated Feed - Fri, 02-Jun-17 09:01

Hi, Spring fans! This week I’m in Chicago for the epic Spring Days Chicago and then I’m off to Singapore for VOXXED Singapore. We’ve got a lot to cover so let’s get to it!

Categories: Java

Spring 5 WebFlux and JDBC: To Block or Not to Block

Javalobby Syndicated Feed - Fri, 02-Jun-17 07:28

The 5th version of the Spring Framework brings a huge step forward in Functional and Reactive Programming support. You don’t need ApplicationContext or dozens of annotations to have the simplest REST API up and running. Spring 5 will offer lightweight Web Functions and reactive WebFlux support that can help this transition happen. Those new features make Java and Spring 5 good candidates for building reactive web applications.

To be reactive, according to The Reactive Manifesto, you have to be Responsive, Resilient, Elastic, and Message Driven. The last criterion in this list has driven a big shift toward asynchronous communication. This includes asynchronous RPC and messaging libraries, database drivers, and more. RDBMSs are quite powerful and useful. The official instruments for database access on the JVM are the drivers provided by database vendors, which implement the JDBC API. But JDBC is designed to be blocking and ties up a thread for each database call. The API itself offers no methods or interfaces that let you receive query results on another thread. There is also an opinion that a transactional database is not a fit for the reactive concept:

Categories: Java

Enforce Software Design With Checkstyle and QDox

Javalobby Syndicated Feed - Fri, 02-Jun-17 03:01

Developers care about the code they write. They build tools that enforce spaces instead of tabs, forbid 1-letter identifiers and ensure that every class and method has Javadoc comments. One example of such a tool is Checkstyle.

But usually, it’s not code style violations that make code hard to read and maintain. More often, it is higher level code organization (software design) – all the decisions made about classes, their responsibilities, connections between them, etc.

Categories: Java

Tips for Scripting Tasks With Bitbucket Pipelines

Javalobby Syndicated Feed - Fri, 02-Jun-17 00:01

With Bitbucket Pipelines, you can quickly adopt a continuous integration or continuous delivery workflow for your repositories. An essential part of this process is to turn manual processes into scripts that a machine can run automatically without the need for human intervention. But sometimes it can be tricky to automate tasks, as you might have some issues with authentication, installing dependencies or reporting issues. This guide will help you with some tips for writing your scripts!

Don't Log Sensitive Information!

Before moving any further into the world of automation, you need to review your logs and make sure that you do not output sensitive data such as API keys, credentials or any information that can compromise your system. As soon as you start using Bitbucket Pipelines to run your scripts, the logs will be stored and readable by anyone who has access to your repository.

Categories: Java

OSGi Dependency Injection

Javalobby Syndicated Feed - Thu, 01-Jun-17 21:01

OSGi follows a standard service model paradigm. This is required because experience with Java shows how hard it is to collaborate through class sharing alone. The standard solution in Java is to use factories that rely on dynamic class loading and statics. For example, if you want a BuilderFactory, you call the static factory method BuilderFactory.newInstance(). Behind that façade, the newInstance method tries every class loader trick to create an instance of an implementation subclass of the BuilderFactory class.

The solution to all these issues is simply the OSGi service registry. A bundle can create an object and register it with the OSGi service registry under one or more interfaces. Other bundles can go to the registry and list all objects that are registered under a specific interface or class. For example, a bundle provides an implementation of the Builder. When it gets started, it creates an instance of its BuilderFactoryImpl class and registers it with the registry under the BuilderFactory class. A bundle that needs a BuilderFactory can go to the registry and ask for all available services registered under the BuilderFactory class. Even better, a bundle can wait for a specific service to appear and then get a callback.
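To make the register/lookup/callback idea concrete, here is a toy registry in plain Java. It is deliberately not the OSGi API (the real one works through BundleContext and service references); ToyServiceRegistry and all of its method names are invented for this sketch, which ignores bundles, lifecycle, and thread safety.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class ToyServiceRegistry {
    private final Map<Class<?>, List<Object>> services = new HashMap<>();
    private final Map<Class<?>, List<Consumer<Object>>> listeners = new HashMap<>();

    // Registers an object under an interface and notifies waiting listeners.
    <T> void register(Class<T> type, T service) {
        services.computeIfAbsent(type, k -> new ArrayList<>()).add(service);
        for (Consumer<Object> listener : listeners.getOrDefault(type, Collections.emptyList())) {
            listener.accept(service);
        }
    }

    // Lists every object currently registered under the given interface.
    <T> List<T> lookup(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Object o : services.getOrDefault(type, Collections.emptyList())) {
            result.add(type.cast(o));
        }
        return result;
    }

    // Requests a callback for services registered after this call.
    <T> void onRegistered(Class<T> type, Consumer<T> callback) {
        listeners.computeIfAbsent(type, k -> new ArrayList<>())
                 .add(o -> callback.accept(type.cast(o)));
    }

    // Stand-in for the BuilderFactory interface mentioned in the text.
    interface BuilderFactory {
        String newBuilder();
    }

    public static void main(String[] args) {
        ToyServiceRegistry registry = new ToyServiceRegistry();
        registry.onRegistered(BuilderFactory.class,
                f -> System.out.println("appeared: " + f.newBuilder()));
        registry.register(BuilderFactory.class, () -> "builder-1"); // prints "appeared: builder-1"
        System.out.println(registry.lookup(BuilderFactory.class).size()); // 1
    }
}
```

The consumer only depends on the interface type; which bundle supplied the implementation never appears in its code.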

Categories: Java

Why Learn Kotlin? [Infographic]

Javalobby Syndicated Feed - Thu, 01-Jun-17 09:01

After the recent Google I/O announcement about Kotlin being an official language for Android, Kotlin has gained popularity amongst developer communities and tech magazines.

Here's an infographic to let you know a bit more about Kotlin, outlining what it is, why you should learn it, and its exciting prospects for the future.

Categories: Java
