Complex data processing and structure transformation

Posted by magic123 on Thu, 30 Dec 2021 15:42:22 +0100

9.1 Any ↔ string

When developing data applications, most of the data is not generated in real time by JavaScript or by user actions. Instead, it is read from the server's data storage and then sent to the client over a network protocol for display.

Let us start with a short aside. Since most of the data used by the front end has to travel from the server over a network protocol, the transmission is essentially a process of encoding and decoding abstract content. Moreover, communication protocols basically carry data as strings (or binary). In a client-server architecture, data structures must therefore be converted into strings first, sent over the network, and then converted back into the original data structures on the other end using the same rules.

9.1.1 JSON

JSON, short for JavaScript Object Notation, is one of the most popular formats for transmitting data over the network. Compared with older data formats such as CSV (comma-separated values) and XML (Extensible Markup Language), JSON is highly readable (its syntax closely follows JavaScript's literal syntax), insensitive to formatting, and lightweight.

{
  "name": "Chaoyang Gan",
  "nickname": "iwillwen"
}

JSON syntax is based on a subset of the JavaScript language standard, so a JSON text can be evaluated directly by a JavaScript engine. However, because evaluating arbitrary text as JavaScript opens the door to attacks, JSON data should not be run as a piece of JavaScript code when it is parsed.

The JavaScript engine provides an eval function that runs a piece of JavaScript code, so if a piece of JSON data is known to be absolutely safe, the eval function can act as a JSON parser.

const jsonStr = `{
  "name": "Chaoyang Gan",
  "nickname": "iwillwen"
}`

eval('var me = ' + jsonStr)

console.log(me.name) //=> Chaoyang Gan

However, if the JSON data to be parsed cannot be guaranteed to be safe, or may even have been maliciously tampered with (through a man-in-the-middle attack, an XSS attack, and so on), evaluating it is very dangerous and can lead to the theft of users' private information.

const somethingImportant = 'some secret'

const jsonStr = `{
  "attack": (function(){
    alert(somethingImportant)
  })()
}`

eval('var me = ' + jsonStr) //=> some secret

To avoid this situation, we must use the JSON.parse function provided by modern JavaScript engines (or another trusted JSON parser) for decoding, and the JSON.stringify function for encoding.

JSON.parse(`{
  "attack": (function(){
    alert(somethingImportant)
  })()
}`) //=> SyntaxError: Unexpected token ( in JSON

Back to the main topic. Generally speaking, the process of converting non-string data into a string through some algorithm is called serialization (a string is also an ordered sequence), and JSON is currently one of the most popular serialization formats.

const jsonStr = JSON.stringify({
  name: 'Chaoyang Gan',
  nickname: 'iwillwen'
})

console.log(jsonStr) //=> {"name":"Chaoyang Gan","nickname":"iwillwen"}
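Serialization is only useful if the other end can restore the data, so it helps to look at a full round trip. The sketch below uses a made-up record object to show that JSON.parse reverses JSON.stringify for plain data, while values that JSON cannot represent (undefined, functions, Date instances) are dropped or turned into strings.

```javascript
const record = {
  nickname: 'iwillwen',
  scores: [1, 2, 3],
  createdAt: new Date(0),    // serialized as an ISO 8601 string
  note: undefined,           // dropped by JSON.stringify
  greet() { return 'hi' }    // functions are dropped as well
}

const text = JSON.stringify(record)
const restored = JSON.parse(text)

console.log(text)
//=> {"nickname":"iwillwen","scores":[1,2,3],"createdAt":"1970-01-01T00:00:00.000Z"}
console.log(typeof restored.createdAt) //=> string
console.log('note' in restored) //=> false
```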

9.1.2 Direct conversion

The advantage of the JSON format is that it can convert data of unknown structure into a string, but it also adds content that is not always necessary, such as JSON's delimiter characters (", {} and so on). When the type of the data to be converted is known in advance and the receiver that parses the serialized string is under your control, you can choose to convert the data directly instead.

Numeric type

In JavaScript, objects have a toString method by default, and numbers can call it as well. For numeric values, you can use this method directly to convert them to strings.

const n1 = 1
const n2 = 1.2

const s1 = n1.toString()
const s2 = n2.toString()

console.log(s1, typeof s1) //=> 1 string
console.log(s2, typeof s2) //=> 1.2 string

In addition to converting numbers directly into strings, we often need to fix the number of digits after the decimal point, for example 5 -> 5.00 and 3.1415 -> 3.14. This is mainly used when displaying tables and charts. For 3.1415, the required 3.14 can be obtained through numerical calculation, but 5 cannot be turned into 5.00 by calculation alone, because JavaScript, unlike some other languages, does not distinguish between integer and non-integer numbers. For this purpose JavaScript provides the Number.prototype.toFixed method, which returns a string. It accepts one numeric parameter: the number of digits to keep after the decimal point. This parameter should normally be a non-negative integer. If a non-integer value is passed in, its fractional part is discarded (it is truncated rather than rounded) before the method is applied.

const int = 5
const pi = Math.PI //=>3.141592653589793 (approximately equal to)

console.log(int.toFixed(2)) //=> '5.00'
console.log(pi.toFixed(2)) //=> '3.14'
console.log(int.toFixed(pi)) //=> '5.000'
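The following sketch isolates three details of toFixed that are easy to overlook: the result is a string, a non-integer digits argument is truncated, and the number itself is rounded to the requested precision. The variable names are chosen for illustration only.

```javascript
const price = 5

// toFixed always returns a string, never a number
const fixed = price.toFixed(2)
console.log(fixed, typeof fixed) //=> 5.00 string

// A non-integer digits argument is truncated: 2.9 behaves like 2
console.log(price.toFixed(2.9)) //=> 5.00

// The value itself is rounded to the requested number of digits
console.log((3.14159).toFixed(3)) //=> 3.142

// Convert back with Number() when a numeric value is needed again
console.log(Number(fixed) + 1) //=> 6
```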

After conversion, a value stored as a string can be converted back to an integer or a floating-point number with parseInt and parseFloat.

console.log(parseInt('5.00')) //=> 5
console.log(parseFloat('3.14')) //=> 3.14
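parseInt and parseFloat read as many leading characters as they can interpret and ignore the rest, which differs from the stricter Number() conversion. The short comparison below is a sketch of that behavior; passing the radix 10 to parseInt explicitly is a common precaution rather than something the original text requires.

```javascript
// parseInt stops at the first character that is not part of an integer
console.log(parseInt('5.99', 10)) //=> 5
console.log(parseInt('42px', 10)) //=> 42

// parseFloat stops at the first character that is not part of a number
console.log(parseFloat('3.14abc')) //=> 3.14

// Number() rejects the whole string if any part of it is invalid
console.log(Number('42px')) //=> NaN

// When nothing can be parsed, the result is NaN
console.log(Number.isNaN(parseInt('abc', 10))) //=> true
```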

Boolean (logical)

Boolean values, that is, true and false (fortunately there is no intermediate state in JavaScript), are written as true and false in JavaScript. Both values are ordinary English words, so converting them to strings is easy.

console.log(true.toString()) //=> 'true'
console.log(false.toString()) //=> 'false'

Converting a string back to a Boolean is not so simple, because JavaScript does not provide a function such as parseBoolean, and as a weakly typed language JavaScript also behaves in puzzling ways in some comparisons.

true == 'true' //=> false
false == 'false' //=> false
true == 1 //=> true
false == 0 //=> true

Therefore, generally speaking, we can use the strict equality operator === to check whether a string is exactly 'true', and treat everything else as false.

function parseBoolean(string) {
  return string === 'true'
}

console.log(parseBoolean('true')) //=> true
console.log(parseBoolean('false')) //=> false
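Strict comparison also rejects inputs such as 'TRUE' or ' true ', which may or may not be what the receiver wants. If the sender's formatting is not fully under control, a slightly more tolerant variant can normalize the input first. The function name parseBooleanLoose is hypothetical, and treating everything other than 'true' as false is an assumption carried over from the version above.

```javascript
// Hypothetical, more tolerant variant: ignores surrounding whitespace and case
function parseBooleanLoose(input) {
  const normalized = String(input).trim().toLowerCase()
  return normalized === 'true'
}

console.log(parseBooleanLoose(' TRUE ')) //=> true
console.log(parseBooleanLoose('false')) //=> false
console.log(parseBooleanLoose('1')) //=> false
console.log(parseBooleanLoose(true)) //=> true
```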

Array

In fact, we already met the string split method in Section 2. It divides a string into an array, using the specified string as the separator.

const str = '1,2,3,4,5'
const arr = str.split(',')

// Note that the elements are still strings, not numbers
console.log(arr) //=> [ '1', '2', '3', '4', '5' ]

Correspondingly, an array can be combined into a single string with the Array.prototype.join method.

const arr = [ 1, 2, 3, 4, 5 ]

console.log(arr.join()) //=> 1,2,3,4,5
console.log(arr.join('#')) //=> 1#2#3#4#5
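Since split always produces strings, restoring an array of numbers takes one extra step. A minimal sketch of the full round trip, with illustrative variable names:

```javascript
const line = '1,2,3,4,5'

// split yields strings; map(Number) converts each element to a number
const parts = line.split(',')
const numbers = parts.map(Number)

console.log(parts) //=> [ '1', '2', '3', '4', '5' ]
console.log(numbers) //=> [ 1, 2, 3, 4, 5 ]

// join restores the original string
console.log(numbers.join(',') === line) //=> true
```

Number is used here instead of parseInt because map passes the element index as a second argument, which parseInt would interpret as a radix.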

9.2 Object ↔ array

When we introduced object literals in Section 5, we mentioned that an array in JavaScript is actually a special kind of object, so in terms of set membership, arrays are a subset of objects (Array ⊆ Object).

But why do we still mention the conversion between objects and arrays? Suppose we need to display the attributes in an object literal in the form of a list:

Although various frameworks have relevant functions or tools to meet this requirement, in order to better understand the differences between data structures and their applications, we still need to understand how to convert data formats.

JavaScript provides the Object.keys() function, which extracts all property keys of an object and returns them as an array.

const object = {
  "name": "Chaoyang Gan",
  "title": "Engineer",
  "subject": "Maths"
}

const keys = Object.keys(object)

console.log(keys) //=> ["name", "title", "subject"]

Once we have the array of property keys, we can combine it with the array map method to extract the value that corresponds to each key.

const list = keys.map(key => {
  return {
    key, value: object[key]
  }
})

console.log(list)
//=> [
// {key: "name", value: "Chaoyang Gan"},
// {key: "title", value: "Engineer"},
// {key: "subject", value: "Maths"}
// ]

Of course, we can also use arrays instead of objects for the inner elements.

const pairs = keys.map(key => {
  return [ key, object[key] ]
})

console.log(pairs)
// => [
// ["name", "Chaoyang Gan"],
// ["title", "Engineer"],
// ["subject", "Maths"]
// ]

Similarly, we can use the _.toPairs method provided by Lodash to convert an object into an array of two-element key-value pairs.

const pairs = _.toPairs(object)
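For plain objects, the built-in Object.entries function (ES2017) returns the same kind of key-value pairs without Lodash. The sketch below uses an illustrative person object and also derives the {key, value} list form from the pairs.

```javascript
const person = {
  name: 'Chaoyang Gan',
  title: 'Engineer',
  subject: 'Maths'
}

// Built-in equivalent of the key-value pair conversion
const entries = Object.entries(person)
console.log(entries)
//=> [ [ 'name', 'Chaoyang Gan' ], [ 'title', 'Engineer' ], [ 'subject', 'Maths' ] ]

// The {key, value} list form can be derived from the pairs with destructuring
const keyValueList = entries.map(([key, value]) => ({ key, value }))
console.log(keyValueList[1]) //=> { key: 'title', value: 'Engineer' }
```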

Once an object can be converted to an array, we naturally need a way to reverse the process. For this you can use the _.fromPairs method provided by Lodash.

const object = _.fromPairs(pairs)
console.log(object)
// => {
// name: "Chaoyang Gan",
// title: "Engineer",
// subject: "Maths"
// }
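In engines that support ES2019, the built-in Object.fromEntries function performs the same reverse conversion. A minimal sketch with illustrative variable names, assuming the pairs have the two-element shape shown above:

```javascript
const pairList = [
  ['name', 'Chaoyang Gan'],
  ['title', 'Engineer'],
  ['subject', 'Maths']
]

// Built-in reverse of Object.entries: pairs in, object out
const rebuilt = Object.fromEntries(pairList)

console.log(rebuilt.title) //=> Engineer
console.log(Object.keys(rebuilt)) //=> [ 'name', 'title', 'subject' ]
```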

In fact, the _.groupBy function we used in Section 5 is also a way to convert an array into an object, but it groups the array into a dictionary keyed by a field or by the result of a transformation, rather than simply converting it.
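To make the difference concrete, here is a rough sketch of grouping by a field using only Array.prototype.reduce. The helper name groupByField and the sample rows are assumptions for illustration; this is not how Lodash implements _.groupBy.

```javascript
// Hypothetical helper: collects rows into arrays keyed by the value of one field
function groupByField(rows, field) {
  return rows.reduce((groups, row) => {
    const key = row[field]
    if (!groups[key]) groups[key] = []
    groups[key].push(row)
    return groups
  }, {})
}

// Sample rows, made up for this example
const people = [
  { name: 'iwillwen', gender: 'male' },
  { name: 'rrrruu', gender: 'female' },
  { name: 'someone', gender: 'male' }
]

const byGender = groupByField(people, 'gender')

console.log(Object.keys(byGender)) //=> [ 'male', 'female' ]
console.log(byGender.male.length) //=> 2
```

The result is an object whose keys are the distinct field values, so several array elements can end up under one key, unlike the one-to-one conversions shown earlier.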

The principle to keep in mind is that data conversion starts from, and serves, an actual requirement; it is not about converting between data structures for its own sake. Before thinking about how to process the data, first clarify what form of data the target needs: an array with plain values as elements (such as the input and output values of an artificial neural network), an array with objects as elements for table display (each object represents one row of the table), or a data frame object stored by column (as commonly used in the ECharts framework).

// Input data for ANN
const xorArray = [ 1, 0, 0, 1, 1, 0, 1 ]

// Row-base dataset
const rDataset = [
  { name: "iwillwen", gender: "male" },
  { name: "rrrruu", gender: "female" }
]

// Column-base dataset
const cDataset = {
  name: [ "iwillwen", "rrrruu" ],
  gender: [ "male", "female" ]
}

Summary

In this section, we learned how to convert between strings, objects and arrays. These are common and simple data conversion needs and methods, typically used as conversion steps during data preprocessing and use.

Exercises

  1. We have introduced two array formats that can store the information of an object. Implement their inverse conversions, fromList (for arrays of {key: "key", value: "value"} elements) and fromPairs, respectively.
  2. Implement the conversion between a row-based dataset and a column-based dataset in both directions.

Topics: Javascript Front-end Network Protocol