Friday, July 31, 2020

JVM optimization for java 8

Starting with Java 8, Metaspace replaces PermGen. So let's discuss JVM performance tuning in the context of JDK 8 and beyond.

Java 8 JVM model

The Java 8 JVM memory model (in light of the serial garbage collection model) looks like the following:

JVM


The heap is a dedicated memory space for Java objects.

The heap is divided into 2 parts, the Young Generation and the Old Generation.

The Young Generation heap is further divided into an Eden space and 2 survivor spaces.

The JVM's class metadata is stored in a native memory area called "Metaspace" (it was previously stored in the PermGen space). The Metaspace region is not part of the contiguous Java heap. The JVM uses this space for garbage collection of metadata, automatic sizing, and concurrent de-allocation of metadata.

The Young Generation heap stores short-lived Java objects, while the Old Generation heap stores long-lived Java objects. Most Java objects are short lived. Iterator objects are an example of short-lived objects, with a lifespan of a single loop. Some objects live long; an object created in public static void main() could live until the program exits.

A Java object's life cycle starts with the keyword new. It is referenced by other objects, and it references other objects to get its work done. Once there is no reference to the object, it can be garbage collected. Once garbage collected, the object no longer exists in the heap. The majority of Java objects die in the Eden space; only a few die in the Old Generation heap.
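
This life cycle can be observed from code with a WeakReference, which lets us see when an object with no remaining strong references has actually been collected. A minimal sketch (the class name and the loop bound are illustrative, not from any API):

```java
import java.lang.ref.WeakReference;

public class LifecycleDemo {
    // Returns true once the weakly referenced object has been garbage collected.
    static boolean becomesUnreachable() {
        Object obj = new Object();                  // life cycle starts with `new`
        WeakReference<Object> ref = new WeakReference<>(obj);
        obj = null;                                 // drop the last strong reference
        // Nudge the collector; allocation pressure makes a collection very likely.
        for (int i = 0; i < 1000 && ref.get() != null; i++) {
            byte[] garbage = new byte[1024 * 1024]; // short-lived, dies in eden
            System.gc();
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("collected: " + becomesUnreachable());
    }
}
```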

Garbage collection occurs in each generation when that generation fills up.

When the Eden space fills up, it causes a minor garbage collection, in which the few live objects in the young generation are saved into a survivor space (no garbage collection happens in the old generation heap). The cost of such a collection is small: a young generation full of dead objects is collected very quickly, because only the few live objects are copied to the survivor space before the Eden space is marked clean.

During each minor garbage collection, some fraction of the surviving objects from the young generation are moved to the old generation. Eventually, the old generation will fill up and must be collected, resulting in a major garbage collection, in which the entire heap is garbage collected. Major collections usually last much longer than minor collections because a significantly larger number of objects are involved.

The conduit from the young generation to the old generation is the pair of survivor spaces in the young generation heap. Most objects are initially allocated in eden. One survivor space is empty at any time, and serves as the destination of any live objects in eden; the other survivor space is the destination during the next copying collection. Objects are copied between survivor spaces in this way until they are old enough to be tenured (copied to the old generation).

JVM performance tuning strategies


This is probably the most important picture for understanding JVM optimization. Now we can understand how these parameters control JVM performance:
  • -XX:MinHeapFreeRatio=<minimum>
  • -XX:MaxHeapFreeRatio=<maximum>
  • -Xms<min>
  • -Xmx<max>
  • -XX:NewRatio=3
  • -XX:SurvivorRatio=6

The above JVM model has a few basic aspects. 

First of all, there is the total heap size the JVM controls. This decides how much memory is allocated to this particular JVM; the more memory the JVM has, the more capable it is. The total heap size is bounded below by -Xms<min> and above by -Xmx<max>. The JVM shrinks and expands the heap between -Xms and -Xmx, reserving some memory for itself in case the Young Generation or Old Generation needs reinforcement. The JVM decides when to expand a generation according to the ratio of free space to live objects. This target range is set as a percentage by the parameters -XX:MinHeapFreeRatio=<minimum> and -XX:MaxHeapFreeRatio=<maximum>. For example, for a range of 40% to 70%: if the percentage of free space in a generation falls below 40%, the JVM will use the reserved memory to expand the generation to maintain 40% free space; similarly, if the free space in a generation exceeds 70%, the JVM will take some space away from the generation back into its reserve. Setting -Xms and -Xmx to the same value increases predictability; however, the JVM is then unable to compensate in bad times.
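
You can watch the heap bounds from inside a program via the standard Runtime API. A small sketch (the class name is illustrative; run it with flags such as `java -Xms256m -Xmx512m HeapBounds` to see -Xms/-Xmx take effect):

```java
public class HeapBounds {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper bound, set by -Xmx
        long total = rt.totalMemory(); // currently committed heap, starts near -Xms
        long free = rt.freeMemory();   // committed but currently unused
        System.out.printf("max=%dMB committed=%dMB free=%dMB%n",
                max >> 20, total >> 20, free >> 20);
    }
}
```

The committed size reported by totalMemory() is what the JVM grows and shrinks between -Xms and -Xmx as the free-ratio goals demand.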

The second most important aspect of the JVM model is the ratio of heap dedicated to the young generation versus the old generation. The bigger a generation, the less often garbage collections occur in it. -XX:NewRatio controls this aspect. For example, setting -XX:NewRatio=3 means that the ratio between the young and old generation is 1:3. In other words, the combined size of the eden and survivor spaces will be one-fourth of the total heap size. Decreasing the ratio means a bigger young generation and a smaller old generation, which implies fewer minor GCs but more frequent full GCs.

Guess what -XX:SurvivorRatio=6 means? It sets the ratio between eden and each survivor space to 6:1. In other words, each survivor space will be one-sixth the size of eden, and thus one-eighth the size of the young generation (not one-seventh, because there are two survivor spaces). If survivor spaces are too small, copying collection overflows directly into the old generation. If survivor spaces are too large, they will be uselessly empty.
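
The arithmetic behind these two ratios is easy to get backwards, so here is a small sketch that computes the sizes for a hypothetical 512 MB heap with -XX:NewRatio=3 and -XX:SurvivorRatio=6 (class and method names are illustrative):

```java
public class GenerationSizes {
    // Young generation share of the heap for -XX:NewRatio=<newRatio>:
    // young:old = 1:newRatio, so young = heap / (newRatio + 1).
    static long youngSize(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    // Survivor-space size for -XX:SurvivorRatio=<survivorRatio>:
    // eden:survivor = survivorRatio:1 and there are two survivor spaces,
    // so each survivor = young / (survivorRatio + 2).
    static long survivorSize(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heap = 512L << 20;                    // 512 MB heap
        long young = youngSize(heap, 3);           // -XX:NewRatio=3      -> 128 MB
        long survivor = survivorSize(young, 6);    // -XX:SurvivorRatio=6 -> 16 MB
        long eden = young - 2 * survivor;          // what remains        -> 96 MB
        System.out.printf("young=%dMB eden=%dMB survivor=%dMB%n",
                young >> 20, eden >> 20, survivor >> 20);
    }
}
```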

The above generation based garbage collection model and parameters are the essence of JVM performance tuning. 

JVM Garbage Collector Options


The rest of this article covers the techniques the JVM uses to augment this generation-based GC model on multi-core machines.

The JVM has 3 modes of generation-based garbage collection. They can be selected with one of the following flags:

  1. -XX:+UseSerialGC

  2. -XX:+UseParallelGC

  3. -XX:+UseG1GC

-XX:+UseConcMarkSweepGC is not in this list because it was deprecated in JDK 9 in favor of -XX:+UseG1GC and is now practically obsolete.

We can skip -XX:+UseSerialGC; all we have to know is that it tells the JVM to use the serial garbage collector, which is what we have discussed so far.

Parallel Garbage Collector


-XX:+UseParallelGC tells the JVM to use the parallel garbage collector. While the serial garbage collector uses a single thread to perform all garbage collection work, the parallel garbage collector, by default, executes both minor and major collections with multiple threads. The parallel GC usually performs significantly better than the serial GC when more than two processors are available. The number of garbage collector threads can be controlled with the command-line option -XX:ParallelGCThreads=<N>. Because multiple garbage collector threads participate in a minor collection, some fragmentation is possible due to promotions from the young generation to the old generation during the collection: each garbage collection thread involved in a minor collection reserves a part of the old generation for promotions, and the division of the available space into these "promotion buffers" can cause a fragmentation effect. Reducing the number of garbage collector threads and increasing the size of the old generation will reduce this fragmentation effect.

The ParallelGC allows automatic tuning by specifying specific behaviors instead of generation sizes and other low-level tuning details. 

-XX:MaxGCPauseMillis=<N> hints to the JVM that pause times of <N> milliseconds or less are desired.
Putting emphasis on this goal can reduce the overall throughput of the application, and the desired pause time goal may not be met.

-XX:GCTimeRatio=<N> sets the ratio of garbage collection time to application running time to 1/(1+<N>). This flag hints to the JVM to put emphasis on meeting the throughput goal. Again, this goal can conflict with the pause time goal and may not be achieved.

The parallel GC also has an implicit goal of minimizing the heap footprint, within the maximum set by -Xmx<N>, as long as the other goals are being met.

There are other flags to control other aspects of the parallel garbage collector. Generation size adjustments, for example, are controlled by
-XX:YoungGenerationSizeIncrement=<Y>
-XX:TenuredGenerationSizeIncrement=<T>
-XX:AdaptiveSizeDecrementScaleFactor=<D>

G1 Garbage Collector

G1 Garbage Collector is the default GC in JDK9. It has the shortest pause time among the 3 GC options.

  1. The serial GC has the smallest GC overhead. If the application has a small data set (up to approximately 100 MB), or it is running on a single core and there are no pause time requirements, then -XX:+UseSerialGC is the best choice.
  2. If peak application performance is the first priority and there are no pause time requirements (or pauses of 1 second or longer are acceptable), then select the parallel collector with -XX:+UseParallelGC.
  3. If response time is more important than overall throughput and garbage collection pauses must be kept shorter than approximately 1 second, then select the mostly concurrent collector with -XX:+UseG1GC.
Now let's study how the G1 garbage collector works and how it achieves short garbage collection pauses.

Recall that the serial GC divides the heap into 3 spaces: the eden space, the survivor spaces and the old generation space. The serial GC performs many minor garbage collections in the eden space. After each minor GC, it saves the surviving objects into a survivor space. Eventually a fraction of the survivor-space objects are promoted into the old generation space. Once the old generation space slowly fills up, a stop-the-world full garbage collection is performed across the whole heap.

G1GC also divides the heap, but into not 3 spaces but many small regions of 3 types: eden regions, survivor regions and old generation regions. (4 types of regions if you like: the 3 types plus un-allocated empty slots.) The region size depends on the heap size; about 2000 regions (or empty slots) per heap is quite typical.
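
When -XX:G1HeapRegionSize is not set explicitly, G1 derives the region size from the heap size roughly like this sketch: aim for about 2048 regions, round down to a power of two, and clamp between 1 MB and 32 MB. This is an approximation of the documented heuristic, not the exact HotSpot code:

```java
public class G1RegionSize {
    // Approximate G1 region-size selection: target ~2048 regions,
    // power of two, clamped to [1 MB, 32 MB].
    static long regionSize(long heapBytes) {
        long target = heapBytes / 2048;
        long size = 1L << 20;                        // minimum 1 MB
        while (size * 2 <= target && size < (32L << 20)) {
            size *= 2;                               // round down to a power of two
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println((regionSize(4L << 30) >> 20) + " MB"); // 4 GB heap
    }
}
```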

Young Collections



At the beginning, most of the regions are un-allocated empty slots. The JVM starts to allocate some regions and create new objects in them; we call those regions eden regions. Short-lived Java objects are quickly born, die, and fill up these initially empty eden regions. After a certain number of eden regions are allocated, the JVM performs a minor garbage collection, copying the surviving objects into a small number of allocated regions which we call survivor regions. The few objects that survive many minor garbage collections are then copied from survivor regions into other allocated regions, which we call old generation regions. This happens across many regions of the heap concurrently.

Young Collection + Concurrent Mark

As the heap ages, after numerous minor garbage collections, more and more survivors enter the old generation regions. Eventually the originally empty heap is full of old generation regions, and it is time for the JVM to prepare a whole-heap cleanup. In the Young Collection + Concurrent Mark phase, the JVM scans the old generation regions of the heap and marks the live objects in them, concurrently. While the JVM performs this concurrent liveness marking, the application keeps running and the minor young garbage collections continue. The world does not stop; it runs as usual, maybe just a little slower than during the Young Collections phase.

Mixed Collections

Young Collection + Concurrent Mark is just a short phase. In no time the JVM finishes marking the live objects in the old generation regions, and the heap enters the final phase of its life cycle -- mixed collections.



During this phase, the normal eden regions -> survivor regions and survivor regions -> old generation regions copying continues as usual. However, these normal minor young generation GCs are now mixed with old generation collections. Additionally, the JVM compacts the live objects of most old generation regions into a small number of old generation regions, putting more and more old generation regions back into the pool of empty slots.

This process continues until most of the old generation regions are returned to the pool of empty slots; the few old generation regions left are full of live objects and not worth further garbage collecting.

The heap is now reborn: the JVM stops the Mixed Collections phase and starts the Young Collections phase, the beginning of a new cycle.

As you can see, throughout the 3 phases of the heap life cycle, the regions are updated incrementally and concurrently, there are always empty regions available for creating new objects, and there is never a stop-the-world collection of the entire heap at once. That is the reason G1GC has the lowest pause time among the 3 JVM GC options.

Thursday, July 30, 2020

sql deadlock

Deadlock can happen when two SQL transactions are each waiting for a lock the other transaction holds. Since neither transaction can finish, no lock is released.

For example, transaction 1 needs to lock table user, then table department:

BEGIN TRANSACTION
update user set salary = 2000 where id = 10;
update department set user_id = 10 where dep_id = 555;
COMMIT TRANSACTION

At the same time, transaction 2 needs to lock table department, then table user:

BEGIN TRANSACTION

update department set user_id = 10 where dep_id = 555;
update user set salary = 2000 where id = 10;
COMMIT TRANSACTION

Imagine transaction 1 executes its first update and locks the user table, while transaction 2 executes its first update and locks the department table.
Now when transaction 1 executes its second update, it needs a lock on the department table, which transaction 2 currently holds. Transaction 2 will not release that lock until it finishes. But transaction 2 cannot make progress either, because it needs a lock on the user table, which transaction 1 is holding.

This example is written in SQL Server syntax, but a similar condition can happen in other databases as well.
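
The classic fix is to make every transaction acquire its locks in the same global order (user before department), so a circular wait can never form. The same principle can be demonstrated with plain Java monitors; a sketch in which the two lock objects stand in for the row locks on the two tables:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockOrdering {
    private static final Object USER = new Object();
    private static final Object DEPARTMENT = new Object();

    // Both workers take the locks in the same global order
    // (USER first, then DEPARTMENT), so a circular wait is impossible.
    static void transfer(CountDownLatch done) {
        synchronized (USER) {
            synchronized (DEPARTMENT) {
                done.countDown(); // here the two UPDATE statements would run
            }
        }
    }

    // Runs two concurrent "transactions"; returns true if both finish.
    static boolean runBoth() throws InterruptedException {
        CountDownLatch done = new CountDownLatch(2);
        Thread t1 = new Thread(() -> transfer(done));
        Thread t2 = new Thread(() -> transfer(done));
        t1.start();
        t2.start();
        boolean finished = done.await(5, TimeUnit.SECONDS);
        t1.join();
        t2.join();
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("both transactions finished: " + runBoth());
    }
}
```

If one of the workers took DEPARTMENT before USER instead, the same program could hang forever, which is exactly the circular wait in the SQL example above.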

InnoDB uses automatic row-level locking. You can get deadlocks even in the case of transactions that just insert or delete a single row. That is because these operations are not really “atomic”; they automatically set locks on the (possibly several) index records of the row inserted or deleted. Table-level locks prevent concurrent updates to the table, avoiding deadlocks at the expense of less responsiveness for a busy system.

No database can work around deadlock errors in general. Databases have to detect such deadlocks and then terminate one of the transactions in order to break the deadlock.

MySQL's InnoDB, for example, detects deadlocks automatically and rolls back one of the transactions; when detection is disabled, it falls back on a configurable lock wait timeout.

Normally, you must write your applications so that they are always prepared to re-issue a transaction if it gets rolled back because of a deadlock. Keep transactions small and short in duration to make them less prone to collision, and commit immediately after making a set of related changes.
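
That "always be prepared to re-issue the transaction" advice is mechanical enough to factor into a helper. A sketch of the retry loop (the exception type is a placeholder; a real JDBC version would inspect the driver's deadlock error, e.g. SQLSTATE 40001, instead of catching any RuntimeException):

```java
import java.util.concurrent.Callable;

public class DeadlockRetry {
    // Re-issues `transaction` up to maxAttempts times.
    static <T> T withRetry(Callable<T> transaction, int maxAttempts) throws Exception {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return transaction.call();
            } catch (RuntimeException deadlock) {
                last = deadlock; // the database rolled us back; try again
            }
        }
        throw last;
    }

    // Simulates a transaction that hits a deadlock twice, then commits.
    static String demo() throws Exception {
        int[] deadlocksLeft = {2};
        return withRetry(() -> {
            if (deadlocksLeft[0]-- > 0) throw new RuntimeException("deadlock, rolled back");
            return "committed";
        }, 5);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```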

Wednesday, July 29, 2020

mysql pagination and its penalty

we usually use a mysql select query to get a list of database entities:
select `id`, `name` from client;

what happens if there are 1 million rows to query?

The general strategy is: don't query a table without a where clause. Always add a filter:

select `id`, `name` from client where `created` between '2012-03-11 00:00:00' and '2012-05-11 23:59:00';

If created is indexed, the above query will run fast and return a small number of rows.

Mysql technically provides a pagination function.

The following query returns the first 500 rows of the 1 million rows in the client table.
select `id`, `name` from client limit 500;

To get the second 500 rows, we can add an offset:
select `id`, `name` from client limit 500 offset 500;

For the next 500, we increase the offset:
select `id`, `name` from client limit 500 offset 1000;
...
The same query has a shorthand:
select `id`, `name` from client limit 500 offset 100000;
is equivalent to
select `id`, `name` from client limit 100000, 500;

This approach has a penalty. If we run the query
explain select `id`, `name` from client limit 100000, 500;

the result shows that MySQL actually reads 100500 rows by the primary key, then throws away the first 100000 rows. What a waste: memory is occupied and lots of rows are read. It is OK for a small offset, but bad for a large offset. This kind of query should not happen in production; it risks draining the database's resources.
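
One common way to sidestep the large-offset penalty is keyset (seek) pagination, which is not in the original post but worth naming: remember the last id of the previous page and query `WHERE id > ? ORDER BY id LIMIT 500`, so the index seeks straight to the right row instead of reading and discarding the first 100000. A sketch of the idea, with an in-memory sorted list standing in for the indexed id column:

```java
import java.util.ArrayList;
import java.util.List;

public class KeysetPagination {
    // Simulates `SELECT id FROM client WHERE id > lastSeenId ORDER BY id LIMIT pageSize`.
    static List<Integer> nextPage(List<Integer> ids, int lastSeenId, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (int id : ids) {                      // ids are in index (sorted) order
            if (id > lastSeenId) {
                page.add(id);
                if (page.size() == pageSize) break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 2000; i++) ids.add(i);
        List<Integer> first = nextPage(ids, 0, 500);         // rows 1..500
        int lastSeen = first.get(first.size() - 1);          // client remembers 500
        List<Integer> second = nextPage(ids, lastSeen, 500); // rows 501..1000
        System.out.println(second.get(0) + ".." + second.get(second.size() - 1));
    }
}
```

The trade-off is that keyset pagination only supports prev/next navigation, not random page jumps, which fits the UI redesign discussed below.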

Alternatively, we can avoid large-offset queries by redesigning the UI. We allow the user to access the first few pages, where the query has a small offset anyway. If users have to click into deeper pages, we'd better redesign the data model: redirect the user to a different page, where a different data model is used. Say, we can partition the table:
CREATE TABLE client (
id INT NOT NULL,
name VARCHAR(20) NOT NULL,
created DATETIME NOT NULL )
PARTITION BY RANGE( YEAR(created) )(
    PARTITION from_2013_or_less VALUES LESS THAN (2014),
    PARTITION from_2014 VALUES LESS THAN (2015),
    PARTITION from_2015 VALUES LESS THAN (2016),
    PARTITION from_2016_and_up VALUES LESS THAN MAXVALUE
);

For the first few pages, we use
SELECT * FROM client PARTITION (from_2016_and_up) WHERE created >= '2016-01-01' limit N, 500;

Then for more pages we just use a different query
SELECT * FROM client PARTITION (from_2015) WHERE created between '2015-01-01 00:00:00' and '2015-12-31 23:59:00' limit N, 500;

If we make more partitions, we simply use more queries.

This way, we don't have to read a huge number of rows; we just read from the relevant partition.

Another good question to ask is: why do we need to present this large amount of data to the user at all? Should we split the tables or add some filter, such as a date range, to limit the total dataset? Does the user really want to click through lots of pages to find the information instead of refining the filter criteria? If they do, let's not allow random page jumps; just provide prev and next (maybe page 1 to 10) hyperlinks, which reasonably encourages the user to think harder. If the number of rows to display really has to be large, why not use a reactive HTTP client with reactive MySQL support? With WebFlux we can stream the DB entities to the client side, so that new rows get loaded dynamically while the user scrolls down the page.



Tuesday, July 28, 2020

set Access-Control-Allow-Origin header in spring web

You've got a normal Spring REST resource like the following,
    @GetMapping("/client")
    public Client getClient(@RequestParam(value = "id", defaultValue = "1") String id) {
        logger.debug("/client requested with parameter {}", id);
        return clientHandler.handle(id);
    } 
then you wrote an HTML file like the following,
demo>cat jquery.html
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script>
$(document).ready(function(){
  $("button").click(function(){
    $.get("http://localhost:8080/client?id=1", function(data, status){
      alert("Data: " + data + "\nStatus: " + status);
    });
  });
});
</script>
</head>
<body>

<button>Send an HTTP GET request to a page and get the result back</button>

</body>
</html>
demo>

Next you test it with Chrome.


Oops, what does the error mean?

Access to XMLHttpRequest at 'http://localhost:8080/client?id=1' from origin 'null' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

Ajax has a same-origin rule that prevents JavaScript from reading responses across domains.
Since your HTML is loaded from a static file instead of being served by a web server, the origin is null in the request issued by jQuery.

To "fix" it and continue testing locally, we need to relax the same-origin rule. Spring allows developers to do that with an extra annotation. Modify your resource like the following:

import org.springframework.web.bind.annotation.CrossOrigin;
...
    @CrossOrigin
    @GetMapping("/client")
    public Client getClient(@RequestParam(value = "id", defaultValue = "1") String id) {
        logger.debug("/client requested with parameter {}", id);
        return clientHandler.handle(id);
    } 

Restart the server and try that jQuery click function again; it now works:



There is a dedicated article I wrote about HTTP clients and servers in depth; more information can be found there.




How to cat multiline content into a file with shell

Sometimes we want to copy a paragraph then paste it into, or append it to, a file.
We can use a text editor, such as vi, to do the copy-paste.

It is more convenient to do it on the command line with just cat and >.

Here is how: the normal way is to echo something followed by >.
In the multi-line case, we need to use <<EOF to mark the end of the input.
EOF is not a literal string that enters the final result; it is the end-of-input marker (the here-document delimiter).

here is the example code:

demo>cat > demo.txt <<EOF
> adfd
> dfadfad
> adfafa
> EOF
demo>cat demo.txt
adfd
dfadfad
adfafa
demo>

The file content is now updated with your strings.


Friday, July 17, 2020

How springboot logging works

Spring is an opinionated framework, which means it chooses reasonable defaults for you if you don't choose them yourself.

Which logging facility does Spring Boot choose for you by default?

The answer is SLF4J + Logback.

When you set the Spring Boot dependency in the pom:

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>

That dependency depends on (thus brings in) spring-boot-starter-logging, and also spring-jcl (the Spring Commons Logging bridge), which is the logging facade the Spring framework itself uses to log.

In our own code, we just log as follows:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.env.Environment;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import com.example.my.mydemo.handler.ClientHandler;
import com.example.my.mydemo.model.Client;

@RestController
public class ClientController {
    Logger logger = LoggerFactory.getLogger(ClientController.class);

    @Autowired
    Environment env;
    
    private final ClientHandler clientHandler;
    @Autowired
    public ClientController(ClientHandler clientHandler) {
        this.clientHandler = clientHandler;
    }
    
    @GetMapping("/client")
    public Client getClient(@RequestParam(value = "id", defaultValue = "1") String id) {
        logger.debug("/client requested with parameter {}", id);
        return clientHandler.handle(id);
    } 
    
    @GetMapping("/")
    public String test() {
        return env.getProperty("Greetings.Visit");
    }
    
    @Override
    public String toString() {
        return "ClientController, use id to lookup client information, url: http://localhost:8080/client?id=1";
    }

}

SLF4J is just a set of interfaces; the classes that implement them come from Logback (logback-classic), which is on the classpath among your Maven dependencies. As you already guessed, Spring did the default configuration for you.

mvn dependency:tree
...
[INFO] +- org.springframework.boot:spring-boot-starter-web:jar:2.3.0.RELEASE:compile
[INFO] |  +- org.springframework.boot:spring-boot-starter:jar:2.3.0.RELEASE:compile
[INFO] |  |  +- org.springframework.boot:spring-boot:jar:2.3.0.RELEASE:compile
[INFO] |  |  +- org.springframework.boot:spring-boot-autoconfigure:jar:2.3.0.RELEASE:compile
[INFO] |  |  +- org.springframework.boot:spring-boot-starter-logging:jar:2.3.0.RELEASE:compile
[INFO] |  |  |  +- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO] |  |  |  |  \- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO] |  |  |  +- org.apache.logging.log4j:log4j-to-slf4j:jar:2.13.2:compile
[INFO] |  |  |  |  \- org.apache.logging.log4j:log4j-api:jar:2.13.2:compile
[INFO] |  |  |  \- org.slf4j:jul-to-slf4j:jar:1.7.30:compile
...

With these logging facilities in place, we just have to follow Spring's convention to configure the logging if we don't like the defaults. The convention is:
as long as you put a file named logback.xml or logback-spring.xml under the directory src/main/resources, Spring will use that file to configure Logback.
Here is an example logback.xml. By default the Logback log level is INFO; this configuration logs DEBUG for the classes ClientController and JdbcTemplate. The ClientController output is printed to the console, which is the default, while the JdbcTemplate output is written to a rolling file whose name carries a date postfix, e.g. logs/sql.20-06-30_19.log; the file name changes over time, and the content looks like:
"2020-06-30 19:12:44,252 DEBUG[http-nio-8080-exec-1] o.s.j.c.JdbcTemplate - Executing prepared SQL statement [select name, email, id from Client where id = ?]"

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<include resource="org/springframework/boot/logging/logback/base.xml"/>
<appender name="MyFile" class="ch.qos.logback.core.rolling.RollingFileAppender">
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<fileNamePattern>
logs/sql.%d{yy-MM-dd_HH}.log
</fileNamePattern>
</rollingPolicy>
<encoder>
<pattern>
%date{ISO8601} %-5level[%thread] %logger{1} - %msg%n
</pattern>
</encoder>
</appender>
<logger name="com.example.my.mydemo.controller.ClientController" level="DEBUG"/>
<logger name="org.springframework.jdbc.core.JdbcTemplate" level="DEBUG">
<appender-ref ref="MyFile"/>
</logger>
</configuration>

Now sit back and enjoy Spring Boot logging for free (almost).

FYI.
If you are a fan of Lombok, the one line
Logger logger = LoggerFactory.getLogger(ClientController.class);
can be replaced by the Lombok annotation @Slf4j.
The annotation will generate one line of code for you:
private static final Logger log = LoggerFactory.getLogger(ClientController.class);

then you can use variable log in later code.

Actually, to use Lombok you have to add a Maven dependency and then an extra import line in the Java code, so it is just a stylistic choice for some developers.
...
import com.example.my.mydemo.model.Client;
import lombok.extern.slf4j.Slf4j;

@RestController
@Slf4j
public class ClientController { 
...
    @GetMapping("/client")
    public Client getClient(@RequestParam(value = "id", defaultValue = "1") String id) {
        log.debug("/client requested with parameter {}", id);
        return clientHandler.handle(id);
    }
...
For more information about Lombok, check out How Lombok saves developers' time.

Thursday, July 16, 2020

How to re-install macos high Sierra

Sometimes you just want to back up the stuff on your old macOS High Sierra machine and start over.

You saved all the needed files on Google Drive and are ready for a new start.

Here are the steps to follow.


  1. Shut down the laptop.
  2. When the laptop starts, press CMD + R during power up.
  3. Once in recovery mode, select Disk Utility and click Continue.
  4. In the next screen, select a disk image under "Disk Images" on the left, then click Erase at the top.
  5. Once you see "Erase process is complete, click Done to continue", just click Done.
  6. Continue doing this until all the disk images under "Disk Images" are gone.
  7. Now select "Macintosh HD" under Internal on the left; the blue bar should show mostly free space.
  8. We are done erasing the old macOS; it is time to start fresh.
  9. Close the Disk Utility window and return to the recovery mode main window.
  10. Select "Reinstall macOS".
  11. In the next window, you may notice the wifi connection is greyed out. Click the wifi icon and put in the wifi id and password.
  12. Click Continue; the wizard will connect to Apple's servers in the cloud and guide you through the process of installing a fresh macOS. Sit back and enjoy. (Don't forget to download the saved files from Google Drive once it finishes.)

Sunday, July 12, 2020

How Lombok saves developers' time

Nowadays, all the cool kids use Lombok to write beautiful Java code. So let's talk about it.

Project Lombok is a Java library that automatically plugs into your editor and build tools to generate boilerplate code such as class constructors, field getters and setters, hashCode, etc.

In order to use it, add lombok as one of the Maven dependencies:

<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>

With that dependency, a lombok.jar is introduced into the project's compilation classpath. The lombok.jar contains a file named /META-INF/services/javax.annotation.processing.Processor. When javac sees this file on a compilation classpath, it runs the annotation processors defined there during compilation. As a result, Lombok annotations such as @Getter are replaced by generated Java getter code blocks. As you can see, Lombok is a jar that works at compilation time.

That javac understands the Lombok annotations doesn't mean your IDE does. Eclipse, for example, needs to have the lombok.jar registered so that it can use it to compile your code before running it. The Eclipse installation is easy; lombok.jar does it for you. You just have to download the jar and run it with
java -jar lombok.jar

The rest of the work is handled by the jar. It scans your computer to find Eclipse installations and installs itself into your chosen Eclipse programs.

Now you can write code such as
package com.example.my.mydemo.dao;
import lombok.Getter;
import lombok.Setter;

@Setter @Getter
public class Client {
    private String name;
    private String email;
    private long id;
}

With the Lombok annotations @Getter @Setter, javac or Eclipse treats the class as equivalent to the following code:

package com.example.my.mydemo.dao;
public class Client {
    private String name;
    private String email;
    private long id;
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getEmail() {
        return email;
    }
    public void setEmail(String email) {
        this.email = email;
    }
    public long getId() {
        return id;
    }
    public void setId(long id) {
        this.id = id;
    }
}

So when other Java code needs to access the getters and setters, it won't fail.
package com.example.my.mydemo.dao;

import org.springframework.jdbc.core.BeanPropertyRowMapper;
import org.springframework.jdbc.core.JdbcTemplate;

import com.example.my.mydemo.model.Client;

public class ClientDao {
    private final JdbcTemplate jdbcTemplate;
    private static final String CLIENTSELECT = "select name, email, id from "
            + "Client where id = ?";
    public ClientDao(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }
    
    public Client getClient(String clientId) {
        Client client = jdbcTemplate.queryForObject(CLIENTSELECT, new Object[]{clientId},
                new BeanPropertyRowMapper<Client>(Client.class));
        return client;
    }

}

That is easy.

Even better, we can use the @Data annotation to get a lot of code for free; it is equivalent to @Getter @Setter @RequiredArgsConstructor @ToString @EqualsAndHashCode all together.


Tuesday, July 7, 2020

Top tools for full stack java developers

Full stack Java developers wear the hats of both front-end and back-end developers. They write the server and client code from start to end with CSS/HTML/JS/Java/SQL/etc.

Let's take a peek into a typical full stack Java developer's magic sack. It is just a peek, because a professional Java developer's technical vocabulary would require a dictionary-length file to enumerate.

Runtime

  • JDK -- When choosing a Java development kit (JDK), there are basically 2 options: a commercial version or a free version. Oracle JDK LTS (Long Term Support) releases come every 3 years and require a commercial license; the Oracle JDK 11 release is not free. You get what you pay for: Oracle keeps updating the JDK with bug fixes and security patches. OpenJDK releases come every 6 months; it is the free, open-source implementation of the Java Platform, Standard Edition, and it has been the official reference implementation of Java SE since version 7. For bug fixes and security patches we should upgrade an OpenJDK build every 6 months. Besides Oracle, other vendors such as Azul Systems also provide Java SE 11 implementations. Their Zulu 11 JVM, for example, is an OpenJDK 11 build. Their Zing JVM is a Java Virtual Machine and runtime platform for Java applications, compliant with the associated Java SE version standards. Zing's typical use cases are applications with low-latency response time requirements or huge heaps up to 20 TB, using its own pauseless garbage collection implementation (C4) and its own just-in-time compiler implementation (Falcon). Zulu, by contrast, is a general-purpose JVM covering more general use cases, as it is mainly OpenJDK with garbage collectors like CMS GC, G1 GC and Parallel GC and the HotSpot JIT. Zulu is a branded build of OpenJDK (free to download and use without restrictions) with paid commercial support available, while Zing is licensed per host.
  • NodeJS -- Node.js is an open-source JavaScript runtime built on Chrome's V8 JavaScript engine. Since Node.js is not a traditional programming language but rather a runtime environment, it is easy to learn for both front-end and back-end developers. Node.js development has matured and can be credited with a large ecosystem. It has not just revolutionized backend development but also contributed in a big way to bringing performance to the front end.
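Whichever JDK build you pick, you can ask the running JVM which version and garbage collectors it is using. A minimal sketch using the standard `java.lang.management` API (the GC bean names printed depend on which collector your JVM runs, e.g. G1 vs Parallel):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JvmInfo {
    // Returns the running JVM's spec version, e.g. "11" on a JDK 11 build.
    static String javaVersion() {
        return System.getProperty("java.specification.version");
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + javaVersion());
        // Each bean corresponds to one collector the JVM is running,
        // e.g. "G1 Young Generation" / "G1 Old Generation" under G1 GC.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("GC: " + gc.getName()
                    + ", collections so far: " + gc.getCollectionCount());
        }
    }
}
```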

IDE

  • Eclipse/IntelliJ IDEA with maven -- the most popular Java IDEs are Eclipse and IntelliJ IDEA. While Eclipse is a free, open-source IDE, IntelliJ IDEA comes in both a free Community edition and a paid Ultimate edition. Most modern Java developers use Maven to manage Java dependencies.
  • Visual Studio Code with npm -- Visual Studio Code is Microsoft's open-source IDE geared towards Node.js development, and the clear winner for full stack web developers. Atom used to be the top IDE a few years ago; now it has been displaced by Visual Studio Code. The two have similar features, but fashion favors Visual Studio Code this year. Another popular IDE is WebStorm, a subset of IntelliJ IDEA geared towards front-end development; JetBrains offers both trial and licensed copies, and it has tons of features — again, you get what you pay for: convenience and security. Other front-end IDEs are text editors in their full glory. Sublime Text, for example, is the code editor used in CoderPad; it has many shortcuts to write code fast, plus tons of plugins available through Package Control, good for free-style editing of different languages. npm is to Node.js what Maven is to Java: when you download Node.js, npm is installed automatically. npm (Node Package Manager) is a command-line tool for interacting with the online repository of open-source Node.js packages.

Server

  • Apache -- the Apache HTTP Server (httpd) is a free and open-source web server that runs a large share of all web servers in the world. Written in C, it is fast, reliable, and secure, and it adapts to different environments through extensions and modules. For example, with mod_proxy it can be configured as a multi-protocol proxy/gateway server; a typical use case is proxying requests to Tomcat or Node.js instances that need to be reached from the internet but have no public IP. For another example, with mod_proxy_balancer it can provide load balancing for all the supported protocols.
  • Tomcat -- born out of the Apache Jakarta Project, Tomcat is an open-source application server written in Java and designed to execute Java servlets and render web pages that use JavaServer Pages (JSP). Tomcat can be configured in Eclipse, IntelliJ, etc. to run servlet-based code locally during development, and it is the default embedded web server of Spring Boot, pulled in as a Maven dependency. While Tomcat is a Java-centric web server, the Apache HTTP Server is a general-purpose HTTP server which supports a number of advanced options that Tomcat doesn't.
  • Jetty -- Tomcat is the most popular Java servlet container/server; arguably the second most popular is Jetty, developed under the Eclipse Foundation. Due to its compactness and small footprint, Jetty is a great fit for constrained environments and for embedding in other products.
  • GlassFish -- ironically, Tomcat and Jetty were supposed to be Java EE application servers. Though the market selected them, neither is technically a fully featured Java EE container: they lack support for many Java EE features and cannot claim the title of "Java EE application server". Oracle lists Java EE 8 Full Platform Compatible Implementations such as GlassFish, IBM WebSphere Application Server, and WildFly; Tomcat and Jetty are not on that list. GlassFish gets contributions from the same people who define the Java EE standards. (Oracle has transferred Java EE to the Eclipse Foundation, where it is called Jakarta EE after Java EE 8.) It is the reference implementation Java EE application server and always supports the latest Java EE features.
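The mod_proxy use case above can be sketched as a small httpd.conf fragment. This is a hypothetical configuration, assuming Tomcat listens on localhost:8080 and that mod_proxy and mod_proxy_http are loaded; the hostname is made up:

```apache
# Forward public traffic for /app to a Tomcat instance that has no public IP.
<VirtualHost *:80>
    ServerName www.example.com
    ProxyPass        "/app" "http://localhost:8080/app"
    ProxyPassReverse "/app" "http://localhost:8080/app"
</VirtualHost>
```

ProxyPassReverse rewrites redirect headers coming back from Tomcat so clients never see the internal address.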

Persistent Layer

  • MySql -- MySQL is the most popular relational SQL database; its popularity is helped by its support across the major cloud providers. MySQL is free and open-source software under the terms of the GNU General Public License, and is also available under a variety of proprietary licenses. There are other relational SQL database options such as MariaDB, PostgreSQL, and Oracle Database. A relational database stores structured information: data written to the database has to match a schema. In recent years a new type of database, the so-called NoSQL database, began to gain popularity. Data written to a NoSQL database can be free-form, with no schema restriction imposed. NoSQL databases come in a variety of types based on their data model; the main types are document, key-value, wide-column, and graph.
  • MongoDB -- MongoDB is the most popular NoSQL database. It uses JSON-like documents with optional schemas, is developed by MongoDB Inc., and is licensed under the Server Side Public License. MongoDB sits at the CP edge of the CAP triangle: it is consistent and partition-tolerant, but not always available. Compared to relational databases, which generally lack partition tolerance, MongoDB can be scaled out horizontally without worrying about network partitions. Data warehouses and data lakes store huge amounts of structured and unstructured data, so the storage has to be distributed among many VMs in different networks; some servers being unable to reach other servers across networks is unavoidable, therefore databases for big data have to be partition-tolerant. That is a key reason MongoDB was naturally selected by the cloud.
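The schema-versus-schemaless contrast can be sketched in plain Java, without any database driver. This is an illustration only, assuming a recent JDK (16+ for records); the field names are made up:

```java
import java.util.List;
import java.util.Map;

public class DataModels {
    // Relational style: every "row" must match this fixed schema.
    record UserRow(int id, String name, String email) {}

    // Document style: each document can carry its own shape, like a
    // MongoDB BSON document (sketched here as a plain Map).
    static Map<String, Object> userDocument() {
        return Map.of(
                "id", 1,
                "name", "alice",
                "tags", List.of("admin", "beta")); // extra field, no schema migration needed
    }

    public static void main(String[] args) {
        UserRow row = new UserRow(1, "alice", "alice@example.com");
        System.out.println(row);
        System.out.println(userDocument());
    }
}
```

Adding a column to `UserRow` means changing the type (and, in a real database, migrating every row); adding a field to the document touches only the documents that carry it.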

Framework and Library

  • spring framework -- Spring is a powerful, lightweight framework for application development using Java as the programming language. The Spring framework comprises many modules such as Core, Beans, Context, Expression Language, AOP, Aspects, Instrumentation, JDBC, ORM, OXM, JMS, Transaction, Web, Servlet, Struts, etc. The root of the Spring framework is the IoC (Inversion of Control) container. It receives configuration metadata from an XML file, Java annotations, or Java code, and from that metadata the container learns which objects — simple Plain Old Java Objects (POJOs) — to instantiate, configure, and assemble.
  • Grails -- the Spring framework's closest competitor is the Grails framework, a powerful Groovy-based web application framework for the JVM. Both Grails and Spring Boot are "full stack frameworks". Groovy is the primary reason developers consider Grails over the simple and elegant Spring Boot: Groovy can be used as a scripting language for the Java platform, almost a super version of Java that still offers Java's enterprise capabilities. Groovy's use for scripting in the Jenkins CI/CD platform should help the JVM language maintain its popularity. However, as a general programming language for the JVM, I'd go with Kotlin or Scala.
  • Angular.js -- AngularJS is an open-source front-end JavaScript framework. Its goal is to augment browser-based applications with Model-View-Controller (MVC) capability and reduce the amount of JavaScript needed to make web applications functional. It is currently the most popular front-end JavaScript framework; the majority of new front-end stacks are built on it.
  • ReactJS -- the second most popular front-end JavaScript framework is ReactJS from Facebook; React Native is widely used for building cross-platform mobile apps.
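The inversion-of-control idea at the root of Spring can be sketched in a few lines of plain Java. This is a toy illustration, not Spring's actual API; the names `GreetingService` and `GreetingController` are made up. The point is that the controller never constructs its own dependency — something outside (in Spring, the IoC container reading your metadata) wires it in:

```java
public class IocSketch {
    interface GreetingService { String greet(String name); }

    static class EnglishGreeting implements GreetingService {
        public String greet(String name) { return "Hello, " + name; }
    }

    // The controller declares what it needs via its constructor;
    // it never calls `new EnglishGreeting()` itself.
    static class GreetingController {
        private final GreetingService service;
        GreetingController(GreetingService service) { this.service = service; }
        String handle(String name) { return service.greet(name); }
    }

    public static void main(String[] args) {
        // In Spring, the IoC container performs this wiring from
        // XML, annotations, or Java config instead of hand-written code.
        GreetingController controller = new GreetingController(new EnglishGreeting());
        System.out.println(controller.handle("world"));
    }
}
```

Swapping in a different `GreetingService` implementation requires no change to the controller, which is what makes the POJOs easy to test and reconfigure.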

Text Editor

  • sublime text -- a popular text editor used by many front-end developers; it can be used unlicensed or with a paid license. 
  • TextMate -- an open-source text editor which is handy for formatting languages and markups such as XML, JSON, and HTML.

Build and Deploy tool

  • Jenkins -- Jenkins is a free and open-source automation server written in Java. It is used to continuously build and test software projects, enabling developers to set up a CI/CD environment. It uses version control tools (SVN, Git, etc.) to check out the code, then runs various Maven commands such as mvn release:perform to build, test, and release the artifacts to the repository. 
  • TeamCity -- an alternative CI/CD tool to Jenkins is TeamCity from JetBrains, the same company that develops IntelliJ IDEA. It is commercial software licensed under a proprietary license. Again, you get what you pay for.
  • Ansible -- a free and open-source automated deployment tool. With roles and playbooks, Ansible lets an authorized user execute tasks such as package deployment, OS updates, and server restarts over SSH. A typical use case is logging in to a set of servers and running the Red Hat native command yum to install RPMs on a Red Hat Linux release, or using Ansible's package module to detect the OS and install the package accordingly. So far it is the most popular configuration management tool because it is easy to use. 
  • Grunt or Gulp -- both Grunt and Gulp are automated task runners on Node.js. The major difference between them lies in how they handle the automation of tasks internally: Gulp uses in-memory node streams to run different tasks, while Grunt uses intermediary files, which means disk I/O operations for the same work. Memory versus I/O, Gulp is clearly faster than Grunt. Gulp is a good choice if you prefer code over configuration: Gulp's stream-style fluent API is cleaner than Grunt's configuration-like API. Both tools are typically used to build, concatenate, and minify JavaScript code for deployment.
  • Hubot -- an open-source tool written in CoffeeScript on Node.js. With out-of-the-box scripts plus your own, Hubot can be customized to automate code deployment from a simple Slack command, email message, Google Home command, etc.
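The Jenkins build/test/release flow described above is usually captured in a Jenkinsfile. A minimal sketch of a declarative pipeline, assuming a Maven project and an agent with Maven on the PATH (the stage names are made up):

```groovy
// Hypothetical Jenkinsfile: check out, build, test, and release with Maven.
pipeline {
    agent any
    stages {
        stage('Build')   { steps { sh 'mvn -B clean package' } }
        stage('Test')    { steps { sh 'mvn -B test' } }
        stage('Release') { steps { sh 'mvn -B release:perform' } }
    }
}
```

Because the pipeline is written in Groovy and lives in version control next to the code, the CI/CD setup is reviewed and versioned like any other source file.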

VM Manager

  • Vagrant -- Vagrant is a free and open-source tool for building and managing virtual machine environments in a single workflow. With Vagrant, developers can make the local development environment as close to the production environment as possible. 
  • Docker -- where Docker relies on the host operating system's kernel, Vagrant includes a full operating system within the package. One big difference between Docker and Vagrant is that Docker containers run on Linux, while Vagrant boxes can contain any operating system.
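A Vagrant environment is described in a Vagrantfile checked into the project. A hypothetical minimal example, assuming the public `ubuntu/focal64` box and a Java app served on port 8080 (the box name and provisioning line are illustrative):

```ruby
# Hypothetical Vagrantfile: one Ubuntu VM that mirrors a production web box.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"
  config.vm.network "forwarded_port", guest: 8080, host: 8080
  config.vm.provision "shell",
    inline: "apt-get update && apt-get -y install openjdk-11-jdk"
end
```

Running `vagrant up` then gives every developer the same VM, which is how Vagrant keeps local environments close to production.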

Process Manager

  • PM2 -- PM2 is a production process manager for Node.js applications with a built-in load balancer. It keeps applications alive forever, reloads them without downtime, and facilitates common system admin tasks. Starting an application in production mode is as easy as: $ pm2 start app.js.

Communication

  • G suite -- with Google Drive, Gmail, Google Calendar, etc., you keep in touch with your teammates.
  • jira -- project management
  • confluence -- share knowledge
  • bitbucket/github/gitflow/svn -- share versioned code
  • slack -- chat tool
  • zoom -- online conference tool
  • citrix -- Citrix is an application that allows you to securely connect to a virtual desktop, server, application, or roaming profile through a terminal (or other computer).
  • Big IP edge client -- vpn connection.  
  • nomachine -- a free, cross-platform, serverless remote desktop tool that lets you set up a remote desktop server on your computer using the NX video protocol.  
  • cisco phone -- phone
