Logo

dev-resources.site

for different kinds of informations.

Something could double the development efficiency of Java programmers

Published at
9/6/2024
Categories
java
programming
devops
esproc
Author
esproc_spl
Categories
4 categories in total
java
open
programming
open
devops
open
esproc
open
Author
10 person written this
esproc_spl
open
Something could double the development efficiency of Java programmers

Computing dilemma in the application
Development and Framework, which should be given the higher priority?
Java is the most commonly used programming language in application development. But writing code to process data in Java isn’t simple. For example, below is the Java code for performing grouping & aggregation on two fields:

Map<Integer, Map<String, Double>> summary = new HashMap<>();
    for (Order order : orders) {
        int year = order.orderDate.getYear();
        String sellerId = order.sellerId;
        double amount = order.amount;
        Map<String, Double> salesMap = summary.get(year);
        if (salesMap == null) {
            salesMap = new HashMap<>();
            summary.put(year, salesMap);
        }
        Double totalAmount = salesMap.get(sellerId);
        if (totalAmount == null) {
            totalAmount = 0.0;
        }
        salesMap.put(sellerId, totalAmount + amount);
    }
    for (Map.Entry<Integer, Map<String, Double>> entry : summary.entrySet()) {
        int year = entry.getKey();
        Map<String, Double> salesMap = entry.getValue();
        System.out.println("Year: " + year);
        for (Map.Entry<String, Double> salesEntry : salesMap.entrySet()) {
            String sellerId = salesEntry.getKey();
            double totalAmount = salesEntry.getValue();
            System.out.println("  Seller ID: " + sellerId + ", Total Amount: " + totalAmount);
        }
    }
Enter fullscreen mode Exit fullscreen mode

By contrast, the SQL counterpart is much simpler. One GROUP BY clause is enough to close the computation.

SELECT year(orderdate),sellerid,sum(amount) FROM orders GROUP BY year(orderDate),sellerid

Indeed, early applications worked with the collaboration of Java and SQL. The business process was implemented in Java at the application side, and data was processed in SQL in the backend database. The framework was difficult to expand and migrate due to database limitations. This was very unfriendly to the contemporary applications. Moreover, on many occasions SQL was unavailable when there were no databases or cross-database computations were involved.

In view of this, later many applications began to adopt a fully Java-based framework, where databases only do simple read and write operations and business process and data processing are implemented in Java at the application side, particularly when microservices emerged. This way the application is decoupled from databases and gets good scalability and migratability, which helps gain framework advantages while having to face the Java development complexity mentioned previously.

It seems that we can only focus on one aspect – development or framework. To enjoy advantages of the Java framework, one must endure difficulty development; and to use SQL, one need to tolerate shortcomings of the framework. This creates a dilemma.

Then what can we do?
What about enhancing Java's data processing capabilities? This not only avoids SQL problems, but also overcomes Java shortcomings.

Actually, Java Stream/Kotlin/Scala are all trying to do so.

Stream

The Stream introduced in Java 8 added many data processing methods. Here is the Stream code for implementing the above computation:

Map<Integer, Map<String, Double>> summary = orders.stream()
            .collect(Collectors.groupingBy(
                    order -> order.orderDate.getYear(),
                    Collectors.groupingBy(
                            order -> order.sellerId,
                            Collectors.summingDouble(order -> order.amount)
                    )
            ));

    summary.forEach((year, salesMap) -> {
        System.out.println("Year: " + year);
        salesMap.forEach((sellerId, totalAmount) -> {
            System.out.println("  Seller ID: " + sellerId + ", Total Amount: " + totalAmount);
        });
    });
Enter fullscreen mode Exit fullscreen mode

Stream indeed simplifies the code in some degree. But overall, it is still cumbersome and far less concise than SQL.

Kotlin

Kotlin, which claimed to be more powerful, improved furtherly:

val summary = orders
        .groupBy { it.orderDate.year }
        .mapValues { yearGroup ->
            yearGroup.value
                .groupBy { it.sellerId }
                .mapValues { sellerGroup ->
                    sellerGroup.value.sumOf { it.amount }
                }
        }
    summary.forEach { (year, salesMap) ->
        println("Year: $year")
        salesMap.forEach { (sellerId, totalAmount) ->
            println("  Seller ID: $sellerId, Total Amount: $totalAmount")
        }
    }
Enter fullscreen mode Exit fullscreen mode

The Kotlin code is simpler, but the improvement is limited. There is still a big gap compared with SQL.

Scala

Then there was Scala:

val summary = orders
        .groupBy(order => order.orderDate.getYear)
        .mapValues(yearGroup =>
            yearGroup
                .groupBy(_.sellerId)
                .mapValues(sellerGroup => sellerGroup.map(_.amount).sum)
    )
    summary.foreach { case (year, salesMap) =>
        println(s"Year: $year")
        salesMap.foreach { case (sellerId, totalAmount) =>
            println(s"  Seller ID: $sellerId, Total Amount: $totalAmount")
        }
    }
Enter fullscreen mode Exit fullscreen mode

Scala is a bit simpler than Kotlin, but still cannot be compared to SQL. In addition, Scala is too heavy and inconvenient to use.

In fact, these technologies are on the right path, though they are not perfect.

Compiled languages are non-hot-swappable
Additionally, Java, being a compiled language, lacks support for hot swapping. Modifying code necessitates recompilation and redeployment, often requiring service restarts. This results in a suboptimal experience when facing frequent changes in requirements. In contrast, SQL has no problem in this regard.

Image description
Java development is complicated, and there are also shortcomings in the framework. SQL has difficulty in meeting the requirements for framework. The dilemma is difficult to solve. Is there any other way?

The ultimate solution – esProc SPL
esProc SPL is a data processing language developed purely in Java. It has simple development and a flexible framework.

Concise syntax
Let’s review the Java implementations for the above grouping and aggregation operation:

Image description
Compare with the Java code, the SPL code is much more concise:

Orders.groups(year(orderdate),sellerid;sum(amount))
Enter fullscreen mode Exit fullscreen mode

It is as simple as the SQL implementation:

SELECT year(orderdate),sellerid,sum(amount) FROM orders GROUP BY year(orderDate),sellerid
Enter fullscreen mode Exit fullscreen mode

In fact, SPL code is often simpler than its SQL counterpart. With support for order-based and procedural computations, SPL is better at performing complex computations. Consider this example: compute the maximum number of consecutive rising days of a stock. SQL needs the following three-layer nested statement, which is difficult to understand, not to mention write.

select max(continuousDays)-1
  from (select count(*) continuousDays
    from (select sum(changeSign) over(order by tradeDate) unRiseDays
       from (select tradeDate,
          case when closePrice>lag(closePrice) over(order by tradeDate)
          then 0 else 1 end changeSign
          from stock) )
group by unRiseDays)
Enter fullscreen mode Exit fullscreen mode

SPL implements the computation with just one line of code. This is even much simpler than SQL code, not to mention the Java code.

stock.sort(tradeDate).group@i(price&lt;price[-1]).max(~.len())
Enter fullscreen mode Exit fullscreen mode

Comprehensive, independent computing capability
SPL has table sequence – the specialized structured data object, and offers a rich computing class library based on table sequences to handle a variety of computations, including the commonly seen filtering, grouping, sorting, distinct and join, as shown below:

Orders.sort(Amount) // Sorting
Orders.select(Amount*Quantity>3000 && like(Client,"*S*")) // Filtering
Orders.groups(Client; sum(Amount)) // Grouping
Orders.id(Client) // Distinct
join(Orders:o,SellerId ; Employees:e,EId) // Join
……
Enter fullscreen mode Exit fullscreen mode

More importantly, the SPL computing capability is independent of databases; it can function even without a database, which is unlike the ORM technology that requires translation into SQL for execution.

Efficient and easy to use IDE
Besides concise syntax, SPL also has a comprehensive development environment offering debugging functionalities, such as “Step over” and “Set breakpoint”, and very debugging-friendly WYSIWYG result viewing panel that lets users check result for each step in real time.

Image description
Support for large-scale data computing
SPL supports processing large-scale data that can or cannot fit into the memory.

In-memory computation:

Image description
External memory computation:

Image description
We can see that the SPL code of implementing an external memory computation and that of implementing an in-memory computation is basically the same, without extra computational load.

It is easy to implement parallelism in SPL. We just need to add @m option to the serial computing code. This is far simpler than the corresponding Java method.

Image description
Seamless integration into Java applications
SPL is developed in Java, so it can work by embedding its JARs in the Java application. And the application executes or invokes the SPL script via the standard JDBC. This makes SPL very lightweight, and it can even run on Android.

Call SPL code through JDBC:

Class.forName("com.esproc.jdbc.InternalDriver");
    con= DriverManager.getConnection("jdbc:esproc:local://");
    st =con.prepareCall("call SplScript(?)");
    st.setObject(1, "A");
    st.execute();
    ResultSet rs = st.getResultSet();
    ResultSetMetaData rsmd = rs.getMetaData();
Enter fullscreen mode Exit fullscreen mode

As it is lightweight and integration-friendly, SPL can be seamlessly integrated into mainstream Java frameworks, especially suitable for serving as a computing engine within microservice architectures.

Highly open framework
SPL’s great openness enables it to directly connect to various types of data sources and perform real-time mixed computations, making it easy to handle computing scenarios where databases are unavailable or multiple/diverse databases are involved.

Image description
Regardless of the data source, SPL can read data from it and perform the mixed computation as long as it is accessible. Database and database, RESTful and file, JSON and database, anything is fine.

Databases:

Image description
RESTful and file:

Image description
JSON and database:

Image description
Interpreted execution and hot-swapping
SPL is an interpreted language that inherently supports hot swapping while power remains switched on. Modified code takes effect in real-time without requiring service restarts. This makes SPL well adapt to dynamic data processing requirements.

Image description
This hot—swapping capability enables independent computing modules with separate management, maintenance and operation, creating more flexible and convenient uses.

SPL can significantly increase Java programmers’ development efficiency while achieving framework advantages. It combines merits of both Java and SQL, and further simplifies code and elevates performance.

SPL open source address

esproc Article's
30 articles in total
Favicon
Add records that meet the criteria before each group after grouping :From SQL to SPL
Favicon
Multi combination condition grouping and aggregation #eg93
Favicon
Split a Huge CSV File into Multiple Smaller CSV Files #eg69
Favicon
Group & Summarize a CSV File #eg68
Favicon
Getting positions of members according to primary key values #eg58
Favicon
Getting members according to primary key values #eg63
Favicon
How to Access Databases using One SQL Statement #eg71
Favicon
Filter a CSV file and re-arrange it by category #eg60
Favicon
Getting positions of members based on a specified condition #eg46
Favicon
Convert Each Whites-space-separated Text Block into a Row #eg62
Favicon
Perform Distinct on Ordered Numbers in a Text File #eg61
Favicon
Parse a csv file having a primary-sub tables structure #eg41
Favicon
Convert CSV Data into Multilevel JSON #eg56
Favicon
Add a compute column to a csv file #eg40
Favicon
SQL, in each group modify the null value of a specified column as its neighboring value #eg43
Favicon
Get the whole group where at least one member meets the specified condition #eg36
Favicon
Parse a csv file where field values are enclosed by quotation marks and contain carriage return #eg35
Favicon
Replace Duplicate Digits in Every 9-digit Number in a Text File with Non-duplicate Ones #eg52
Favicon
Reverse Rows in a Text File #eg51
Favicon
The Difference between Each Value in a Certain Column and Its Previous One and Display Result
Favicon
Java, perform COUNT on each group of a large csv file #eg33
Favicon
SQL, extract unique values of JSON format field from each group #eg42
Favicon
Multi-condition filtering #eg48
Favicon
Getting members based on a specified condition #47
Favicon
Read specified columns from a csv file #eg44
Favicon
Something could double the development efficiency of Java programmers
Favicon
Java, fill each row having a null value in a csv file with values in the directly previous row #eg32
Favicon
To Index Data is To Sort Data
Favicon
Clear duplicate lines and lines having missing values from a csv file #eg24
Favicon
SQL, Set different flags for different groups according to whether there are duplicate values #eg19

Featured ones: