Handling Large Integers in JavaScript

Why we should care: JavaScript’s unique Number type

Handling large integers in JavaScript requires care. JavaScript treats ordinary numeric values as a single type called Number. Unlike many other programming languages, it does not divide numbers into real and integer types such as float and int, or into storage sizes such as short, int, and long. The Number type uses the IEEE-754 64-bit floating-point representation, so integers are only represented exactly up to 2^53 - 1. This is very convenient for most casual calculations, but large integers need more care than in other programming languages. In settings such as competitive programming or coding tests, overlooking this can make a solution fail.
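The limit described above can be checked directly. The following is a minimal sketch, assuming a runtime that supports BigInt (the variable names are only for illustration); it shows where Number stops being exact and one way to keep large integers exact.

```javascript
// Integers are exact in a Number only up to 2^53 - 1.
const maxSafe = Number.MAX_SAFE_INTEGER; // 9007199254740991

// Beyond that limit, distinct integers can collapse to the same value.
const collapsed = 2 ** 53 + 1 === 2 ** 53; // true

// Number.isSafeInteger reports whether an integer is inside the exact range.
const safe = Number.isSafeInteger(2 ** 53); // false

// BigInt keeps arbitrary-precision integers exact (note the `n` suffix).
const exact = 2n ** 53n + 1n;

console.log(maxSafe, collapsed, safe, exact.toString());
```

BigInt and Number values cannot be mixed in one arithmetic expression, so a calculation has to stay in one type from start to finish.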

Read More

Migrate to useEffect

TL;DR

import { useRef, useEffect } from 'react'

// Inside a function component:
const mounted = useRef(false)

useEffect(() => {
  if (!mounted.current) {
    // First run: mark the component as mounted
    mounted.current = true;
    // componentDidMount part
  } else {
    // componentDidUpdate part
  }
  return () => {
    // componentWillUnmount part (also runs before each re-run of the effect)
  }
}, Array_of_subscribed_props_and_states)

Compare Two Tables in SQL

You may want to compare results while you develop and improve data processing tasks. For example, you might want to check whether the table produced by an improved query is the same as the one produced by the existing query. If the tables differ, you might want to know how many rows differ and which fields differ. In other words, you might want to compare two tables the way the Linux diff command compares files.

You might hesitate to compare two tables row by row if the tables are big. But don’t worry. You can have a cluster of servers compare the tables with a simple SQL statement.

Read More

JavaScript Pitfalls & Tips: 2D Array, Matrix

Sometimes, you need to create and manipulate a two-dimensional (2D) array or a matrix. In JavaScript, an array of arrays can be used as a 2D array.

const mat = [
  [0, 1, 2, 3],
  [4, 5, 6, 7],
  [8, 9, 10, 11],
  [12, 13, 14, 15]
];
console.log(mat[1][2]); // gives 6

How do you initialize an m x n matrix with zeros? If you know about Array.prototype.fill(), you may be tempted to write Array(m).fill(Array(n).fill(0)), but this does not work properly. Array(n).fill(0) creates just one array of size n, and every element of the outer array, which represents the rows, points to that same inner array, so a change to one row shows up in every row.
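The shared-row problem is easy to reproduce. The sketch below uses illustrative names (m, n, shared, zeros) and shows Array.from as one common way to build independent rows; the full post may use a different fix.

```javascript
const m = 3;
const n = 4;

// Pitfall: fill() stores the same array reference in every slot.
const shared = Array(m).fill(Array(n).fill(0));
shared[0][0] = 1;
console.log(shared[1][0]); // 1, because row 1 is the same object as row 0

// One alternative: create a fresh row for each index with Array.from.
const zeros = Array.from({ length: m }, () => Array(n).fill(0));
zeros[0][0] = 1;
console.log(zeros[1][0]); // 0, because each row is its own array
```

The callback passed to Array.from runs once per row, which is what makes each inner array a separate object.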

Read More

SQL Window Functions: row_number, rank, dense_rank

SQL window functions are functions related to the ordering of elements within each partition. In the last post, I introduced an example of getting the top-3 elements for each partition using the rank() function.

row_number(), rank() and dense_rank() are SQL window functions that number each element according to the ordering. They look similar. In fact, their results are the same if all the values of the ordering column are unique. But their results differ when there are ties, that is, elements with the same value in the ordering.
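To make the difference on ties concrete, here is a small JavaScript sketch that emulates the three numbering schemes over a single partition ordered by score descending. It is not SQL, and numberRows is a hypothetical helper written only for this illustration.

```javascript
// Hypothetical helper: number the rows of one partition, ordered by
// score descending, the way row_number, rank and dense_rank would.
function numberRows(scores) {
  const sorted = [...scores].sort((a, b) => b - a);
  let rank = 0;
  let denseRank = 0;
  return sorted.map((score, i) => {
    if (i === 0 || score !== sorted[i - 1]) {
      rank = i + 1;   // rank jumps past all rows tied above it
      denseRank += 1; // dense_rank advances by one, leaving no gaps
    }
    // row_number is always the position in the ordering
    return { score, rowNumber: i + 1, rank, denseRank };
  });
}

const rows = numberRows([90, 100, 90, 80]);
console.log(rows);
// score 100: rowNumber 1, rank 1, denseRank 1
// score  90: rowNumber 2, rank 2, denseRank 2
// score  90: rowNumber 3, rank 2, denseRank 2
// score  80: rowNumber 4, rank 4, denseRank 3
```

In a real database, the order in which row_number() numbers tied rows is not guaranteed unless the ordering includes a tie-breaking column.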

Read More

SQL Window Functions: Top-k Elements Example

SQL window functions are similar to aggregate functions, such as count(), sum() or avg(), but are used differently. Window functions such as rank() and row_number() are related to ordering, while aggregate functions such as count() and sum() summarize a set of values. Window functions have been part of the ISO/ANSI SQL standard since 2003. You can use SQL window functions in HiveQL, MySQL, Transact-SQL as well as Spark SQL.

When processing data, you often need to split data into groups or partitions and get representative values for each group. One way to solve this problem is to assign a key to each record first and then compute a value for each key. You may use map() and reduce() in the MapReduce framework, or map() and reduceByKey() or aggregate() in the Spark RDD API.

If the argument function of reduce() is addition (_ + _), the query can be written as sum() in SQL. If it is an increment (++), it can be written as count() in SQL. But if the function passed to reduce() is something like ‘top-10 largest values’, it is less obvious which SQL statements or functions correspond to it. Some data engineers or scientists may think such a query cannot be written in SQL, and decide to code it in MapReduce or Spark instead.
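The correspondence can be sketched with JavaScript's Array.prototype.reduce standing in for the reduce() of MapReduce or Spark. The names values and topK are assumptions made for this illustration, not part of any framework.

```javascript
const values = [5, 3, 9, 1, 7];

// A reducer that adds corresponds to sum() in SQL.
const sum = values.reduce((acc, v) => acc + v, 0);

// A reducer that increments a counter corresponds to count().
const count = values.reduce((acc) => acc + 1, 0);

// A 'top-k largest values' reducer has no single aggregate counterpart:
// its accumulator is a sorted list holding at most k values.
const topK = (k) => (acc, v) =>
  [...acc, v].sort((a, b) => b - a).slice(0, k);
const top3 = values.reduce(topK(3), []);

console.log(sum, count, top3);
```

The first two reducers collapse the input to a single number, while the third must carry a list between steps, which is why it does not map onto a plain aggregate function.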

In this post, I will show an example that solves the top-k frequent elements problem using rank(), one of the SQL window functions.

Read More