How do you get a string to a character array in JavaScript?

Question

How do you convert a string to a character array in JavaScript?

I'm thinking of turning a string like "Hello world!" into the array
['H','e','l','l','o',' ','w','o','r','l','d','!']

Ray Foss · Accepted Answer · 2019-06-17 14:24:37Z

Note: This is not Unicode compliant. "I💖U".split('') results in the 4-element array ["I", "�", "�", "U"], which can lead to dangerous bugs. See the answers below for safe alternatives.

Just split it by an empty string.

var output = "Hello world!".split('');
console.log(output);

See the String.prototype.split() MDN docs.
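
To make the warning in the note concrete, here is a minimal sketch comparing split('') with Array.from on the same string. The variable names are illustrative only and the string is written with an escape so it survives any encoding issues.

```javascript
// "💖" (U+1F496) lies outside the Basic Multilingual Plane, so UTF-16
// stores it as two code units (a surrogate pair).
const text = "I\u{1F496}U"; // same string as "I💖U"

const codeUnits = text.split("");    // splits between the two surrogates
const codePoints = Array.from(text); // keeps the surrogate pair together

console.log(codeUnits.length);  // 4
console.log(codePoints.length); // 3
console.log(codePoints[1] === "\u{1F496}"); // true
```

If the input is guaranteed to be ASCII, both approaches give the same array; the difference only shows up once characters outside the Basic Multilingual Plane appear.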

edited Jun 17 '19 at 14:24 by Ray Foss
answered Dec 28 '10 at 16:41 by meder omuraliev

This doesn't take into account surrogate pairs. "𨭎".split('') results in ["�", "�"]. – hippietrail Feb 13 '15 at 18:15
See @hakatashi's answer elsewhere in this thread. Hopefully everyone sees this... DO NOT USE THIS METHOD, IT'S NOT UNICODE SAFE – i336_ Feb 5 '16 at 4:22
Bit late to the party, but why would someone ever want to make an array out of a string? A string is already an array, or am I wrong? "randomstring".length; // 12 "randomstring"[2]; // "n" – Luigi van der Pal Dec 8 '16 at 11:19
@LuigivanderPal A string is not an array, but it is very similar. However, it is not similar to an array of characters. A string is similar to an array of 16-bit numbers, some of which represent characters and some of which represent half of a surrogate pair. For example, str.length does not tell you the number of characters in the string, since some characters take more space than others; str.length tells you the number of 16-bit numbers. – Theodore Norvell Apr 5 '19 at 13:00

Mark Amery · Accepted Answer · 2020-01-17 20:24:47Z

As hippietrail suggests, meder's answer can break surrogate pairs and misinterpret “characters.” For example:

// DO NOT USE THIS!
> '𝟘𝟙𝟚𝟛'.split('')
[ '�', '�', '�', '�', '�', '�', '�', '�' ]

I suggest using one of the following ES2015 features to correctly handle these character sequences.

Spread syntax (already answered by insertusernamehere)

> [...'𝟘𝟙𝟚𝟛']
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

Array.from

> Array.from('𝟘𝟙𝟚𝟛')
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

RegExp `u` flag

> '𝟘𝟙𝟚𝟛'.split(/(?=[\s\S])/u)
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

Use /(?=[\s\S])/u instead of /(?=.)/u because . does not match newlines.
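
A small sketch of why the character class matters, using a hypothetical two-line input. It also shows the ES2018 `s` (dotAll) flag, which is not mentioned in the answer above but makes `.` match newlines in engines that support it.

```javascript
const multiline = "a\nb";

// `.` does not match "\n", so no split happens in front of the newline.
const withDot = multiline.split(/(?=.)/u);
// `[\s\S]` matches any character, including "\n".
const withAnyChar = multiline.split(/(?=[\s\S])/u);
// With the dotAll flag, `.` also matches "\n" (ES2018 and later).
const withDotAll = multiline.split(/(?=.)/su);

console.log(withDot);     // [ 'a\n', 'b' ]
console.log(withAnyChar); // [ 'a', '\n', 'b' ]
console.log(withDotAll);  // [ 'a', '\n', 'b' ]
```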

If you are still targeting ES5.1 (or if your browser doesn't handle this regex correctly, like Edge), you can use this alternative (transpiled by Babel):

> '𝟘𝟙𝟚𝟛'.split(/(?=(?:[\0-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]))/);
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

Note that Babel also tries to handle unmatched surrogates correctly. However, this doesn't seem to work for unmatched low surrogates.

How did you form these characters? It looks like each character is 4 bytes. – user420667, May 18 '16 at 18:00
@user420667 The characters are from a supplementary plane of the Unicode table with "big" code points, so they don't fit into 16 bits. The UTF-16 encoding used in JavaScript represents these characters as surrogate pairs (special code units that are only used in pairs to form characters from the supplementary planes). Only characters in the main character plane are represented with 16 bits. The surrogate code units themselves are also from the main character plane, if that makes sense. – Olga, Jan 11 '17 at 16:38
Performance of the different techniques, spread op looks like the champ (chrome 58). – Adrien, May 30 '17 at 3:45
Note that this solution splits some emoji such as 🏳️‍🌈, and splits combining diacritical marks from their base characters. If you want to split into grapheme clusters instead of characters, see stackoverflow.com/a/45238376. – user202729, Aug 30 '18 at 6:21
Note that while not breaking apart surrogate pairs is great, it isn't a general-purpose solution for keeping "characters" (or more accurately, graphemes) together. A grapheme can be made up of multiple code points; for instance, the name of the script Devanagari is "देवनागरी", which is read by a native speaker as five graphemes, but takes eight code points to produce... – T.J. Crowder, Sep 17 '18 at 12:08

insertusernamehere · Accepted Answer · 2018-09-17 12:01:55Z

The spread Syntax

You can use the spread syntax, an array initializer feature introduced in the ECMAScript 2015 (ES6) standard:

var arr = [...str];

Examples

function a() {
    return arguments;
}

var str = 'Hello World';

var arr1 = [...str],
    arr2 = [...'Hello World'],
    arr3 = new Array(...str),
    arr4 = a(...str);

console.log(arr1, arr2, arr3, arr4);

The first three result in:

["H", "e", "l", "l", "o", " ", "W", "o", "r", "l", "d"]

The last one results in

{0: "H", 1: "e", 2: "l", 3: "l", 4: "o", 5: " ", 6: "W", 7: "o", 8: "r", 9: "l", 10: "d"}
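
The last result is an array-like arguments object rather than a real array. A minimal sketch of how to tell the two apart and convert one into the other; `collectArguments` is a hypothetical stand-in for the function `a` above.

```javascript
function collectArguments() {
  return arguments; // array-like `arguments` object, not an Array
}

const argsObject = collectArguments(...'Hello');

console.log(Array.isArray(argsObject));   // false
console.log(Array.isArray([...'Hello'])); // true

// Convert the array-like object into a real array before using array methods.
const realArray = Array.from(argsObject);
console.log(realArray.join('-')); // "H-e-l-l-o"
```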

Browser Support

Check the ECMAScript ES6 compatibility table.

Further reading

Spread is also referred to as "splat" (e.g. in PHP or Ruby) or as "scatter" (e.g. in Python).

If you use the spread operator in combination with a compiler to ES5, then this won't work in IE. Take that into consideration. It took me hours to figure out what the problem was. – Stef van den Berg, Jun 21 '17 at 12:06

Community · Accepted Answer · 2020-06-20 09:12:55Z

You can also use Array.from.

var m = "Hello world!";
console.log(Array.from(m))

This method has been introduced in ES6.
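
Array.from also accepts an optional mapping function as its second argument, which can be handy when each character needs transforming while the array is built. A brief sketch with made-up inputs:

```javascript
// The mapping function is called once per iterated code point.
const shouted = Array.from('abc', (ch) => ch.toUpperCase());
console.log(shouted); // [ 'A', 'B', 'C' ]

// Works per code point, so characters outside the BMP stay intact.
const digits = '\u{1D7D8}\u{1D7D9}'; // "𝟘𝟙"
const hexCodePoints = Array.from(digits, (ch) => ch.codePointAt(0).toString(16));
console.log(hexCodePoints); // [ '1d7d8', '1d7d9' ]
```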

Reference

Array.from

answered Oct 13 '16 at 12:48 by Rajesh

hashed_name · Accepted Answer · 2019-05-25 12:28:31Z

This is an old question but I came across another solution not yet listed.

You can use the Object.assign function to get the desired output:

var output = Object.assign([], "Hello, world!");
console.log(output);
    // [ 'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '!' ]

Not necessarily right or wrong, just another option.

Object.assign is described well at the MDN site.
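
One caveat worth checking before relying on this: Object.assign copies the string's indexed properties, which are UTF-16 code units, so it behaves like split('') rather than like Array.from. A minimal sketch with an illustrative emoji string:

```javascript
const heart = '\u{1F496}'; // "💖": one code point, two UTF-16 code units

const viaAssign = Object.assign([], heart);
const viaArrayFrom = Array.from(heart);

console.log(viaAssign.length);    // 2 (two lone surrogates)
console.log(viaArrayFrom.length); // 1
```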

That's a long way around to get to Array.from("Hello, world"). — T.J. Crowder, Sep 17 '18 at 11:53
@T.J.Crowder That's a long way around to get to [..."Hello, world"] — chharvey, Jun 28 '19 at 0:23

hashed_name · Accepted Answer · 2019-05-25 11:29:23Z

It already is:

var mystring = 'foobar';
console.log(mystring[0]); // Outputs 'f'
console.log(mystring[3]); // Outputs 'b'

Or, for a version that is friendlier to older browsers, use:

var mystring = 'foobar';
console.log(mystring.charAt(3)); // Outputs 'b'
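
The two access styles agree for valid indexes but differ when the index is out of range, which matters if the result is compared or concatenated later. A short sketch:

```javascript
const mystring = 'foobar';

// Bracket access returns undefined past the end of the string...
console.log(mystring[10]);        // undefined
// ...while charAt returns an empty string.
console.log(mystring.charAt(10)); // ""

// For valid indexes both return the same one-code-unit string.
console.log(mystring[3] === mystring.charAt(3)); // true
```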

edited May 25 '19 at 11:29 by hashed_name
answered Dec 28 '10 at 16:43 by dansimau

-1: it isn't. Try it: alert("Hello world!" == ['H','e','l','l','o',' ','w','o','r','l','d']) – R. Martinho Fernandes Dec 28 '10 at 16:48
Sorry. I guess what I meant to say is: "you can access individual characters by index reference like this without creating a character array". – dansimau Dec 28 '10 at 16:50
Not reliably cross-browser you can't. It's an ECMAScript Fifth Edition feature. – bobince Dec 28 '10 at 17:25
The cross-browser version is mystring.charAt(index). – psmay Dec 28 '10 at 18:04
+1 for charAt()--though I'd prefer to use the array-ish variant. Darn IE. – Zenexer Jul 4 '14 at 2:57

Mark Amery · Accepted Answer · 2020-02-01 22:34:35Z

There are (at least) three different things you might conceive of as a "character", and consequently, three different categories of approach you might want to use.

Splitting into UTF-16 code units

JavaScript strings were originally invented as sequences of UTF-16 code units, back at a point in history when there was a one-to-one relationship between UTF-16 code units and Unicode code points. The .length property of a string measures its length in UTF-16 code units, and when you do someString[i] you get the ith UTF-16 code unit of someString.

Consequently, you can get an array of UTF-16 code units from a string by using a C-style for-loop with an index variable...

const yourString = 'Hello, World!';
const charArray = [];
for (let i=0; i<yourString.length; i++) {
    charArray.push(yourString[i]);
}
console.log(charArray);

There are also various short ways to achieve the same thing, like using .split() with the empty string as a separator:

const charArray = 'Hello, World!'.split('');
console.log(charArray);

However, if your string contains code points that are made up of multiple UTF-16 code units, this will split them into individual code units, which may not be what you want. For instance, the string '𝟘𝟙𝟚𝟛' is made up of four unicode code points (code points 0x1D7D8 through 0x1D7DB) which, in UTF-16, are each made up of two UTF-16 code units. If we split that string using the methods above, we'll get an array of eight code units:

const yourString = '𝟘𝟙𝟚𝟛';
console.log('First code unit:', yourString[0]);
const charArray = yourString.split('');
console.log('charArray:', charArray);
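
To see how the two code units relate to the code point, here is a small sketch that decodes one surrogate pair by hand using the standard UTF-16 formula. The variable names are illustrative.

```javascript
const digitZero = '\u{1D7D8}'; // "𝟘"

const high = digitZero.charCodeAt(0); // high (leading) surrogate
const low = digitZero.charCodeAt(1);  // low (trailing) surrogate

// Standard UTF-16 decoding formula for a surrogate pair.
const decoded = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;

console.log(high.toString(16), low.toString(16));  // d835 dfd8
console.log(decoded.toString(16));                 // 1d7d8
console.log(decoded === digitZero.codePointAt(0)); // true
```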

Splitting into Unicode Code Points

So, perhaps we want to instead split our string into Unicode Code Points! That's been possible since ECMAScript 2015 added the concept of an iterable to the language. Strings are now iterables, and when you iterate over them (e.g. with a for...of loop), you get Unicode code points, not UTF-16 code units:

const yourString = '𝟘𝟙𝟚𝟛';
const charArray = [];
for (const char of yourString) {
  charArray.push(char);
}
console.log(charArray);
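
The for...of loop above is driven by the string's built-in iterator. A minimal sketch that calls the iterator directly, to show that each step yields a whole code point:

```javascript
const yourString = '\u{1D7D8}\u{1D7D9}'; // "𝟘𝟙"
const iterator = yourString[Symbol.iterator]();

const first = iterator.next();
const second = iterator.next();
const finished = iterator.next();

console.log(first.value === '\u{1D7D8}', first.done); // true false
console.log(second.value === '\u{1D7D9}');            // true
console.log(finished.done);                           // true
```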

We can shorten this using Array.from, which iterates over the iterable it's passed implicitly:

const yourString = '𝟘𝟙𝟚𝟛';
const charArray = Array.from(yourString);
console.log(charArray);

However, Unicode code points are not the largest thing that could be considered a "character" either. Some examples of things that could reasonably be considered a single "character" but are made up of multiple code points include:

Accented characters, if the accent is applied with a combining code point
Flags
Some emojis

We can see below that if we try to convert a string with such characters into an array via the iteration mechanism above, the characters end up broken up in the resulting array. (In case any of the characters don't render on your system, yourString below consists of a capital A with an acute accent, followed by the flag of the United Kingdom, followed by the woman emoji with a dark skin tone.)

const yourString = 'Á🇬🇧👩🏿';
const charArray = Array.from(yourString);
console.log(charArray);
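
The same breakdown can be counted explicitly. This sketch rebuilds the string from escapes, assuming the accent is a separate combining code point (U+0301) as described in the list above, and also shows that NFC normalization only helps with the accent.

```javascript
const accentedA = 'A\u0301';         // "A" + combining acute accent
const ukFlag = '\u{1F1EC}\u{1F1E7}'; // regional indicators G + B
const woman = '\u{1F469}\u{1F3FF}';  // woman + dark skin tone modifier

const parts = Array.from(accentedA + ukFlag + woman);
console.log(parts.length); // 6 code points for 3 visible "characters"

// NFC normalization merges the accent into one code point (U+00C1),
// but it cannot merge flags or emoji modifier sequences.
console.log(Array.from(accentedA.normalize('NFC')).length); // 1
console.log(Array.from(ukFlag.normalize('NFC')).length);    // 2
```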

If we want to keep each of these as a single item in our final array, then we need an array of graphemes, not code points.

Splitting into graphemes

JavaScript has no built-in support for this - at least not yet. So we need a library that understands and implements the Unicode rules for what combination of code points constitute a grapheme. Fortunately, one exists: orling's grapheme-splitter. You'll want to install it with npm or, if you're not using npm, download the index.js file and serve it with a <script> tag. For this demo, I'll load it from jsDelivr.

grapheme-splitter gives us a GraphemeSplitter class with three methods: splitGraphemes, iterateGraphemes, and countGraphemes. Naturally, we want splitGraphemes:

const splitter = new GraphemeSplitter();
const yourString = 'Á🇬🇧👩🏿';
const charArray = splitter.splitGraphemes(yourString);
console.log(charArray);

<script src="https://cdn.jsdelivr.net/npm/grapheme-splitter@1.0.4/index.js"></script>

And there we are - an array of three graphemes, which is probably what you wanted.
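
Since this answer was written, many engines have added Intl.Segmenter (part of the ECMAScript Internationalization API), which can perform a grapheme split without a library. The following is a hedged sketch that assumes a runtime shipping Intl.Segmenter; the helper name `splitGraphemes` is hypothetical and merely mirrors the library method, and the function returns null where the API is missing.

```javascript
// Same string as above, written with escapes: accented A, UK flag, woman with dark skin tone.
const yourString = 'A\u0301\u{1F1EC}\u{1F1E7}\u{1F469}\u{1F3FF}';

function splitGraphemes(text) {
  if (typeof Intl === 'undefined' || typeof Intl.Segmenter !== 'function') {
    return null; // not available in this runtime; fall back to a library
  }
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  // segment() returns an iterable of { segment, index, input } records.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

const graphemes = splitGraphemes(yourString);
console.log(graphemes === null ? 'Intl.Segmenter unavailable' : graphemes.length);
```

Where Intl.Segmenter is available this should report three graphemes; older browsers and runtimes still need a library such as grapheme-splitter.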

KyleMit · Accepted Answer · 2019-03-28 21:03:27Z

You can iterate over the length of the string and push the character at each position:

const str = 'Hello World';

const stringToArray = (text) => {
  var chars = [];
  for (var i = 0; i < text.length; i++) {
    chars.push(text[i]);
  }
  return chars
}

console.log(stringToArray(str))

edited Mar 28 '19 at 21:03 by KyleMit
answered Jun 28 '16 at 5:51 by Mohit Rathore

While this approach is a little more imperative than declarative, it's the most performant of any in this thread and deserves more love. One limitation to retrieving a character on a string by position is when dealing with characters past the Basic Multilingual Plane in Unicode such as emojis. "😃".charAt(0) will return an unusable character – KyleMit Mar 28 '19 at 20:55
@KyleMit this seems only true for a short input. Using a longer input makes .split("") the fastest option again – Lux Mar 31 '19 at 12:03
Also .split("") seems to be heavily optimized in Firefox. While the loop has similar performance in Chrome and Firefox, split is significantly faster in Firefox for small and large inputs. – Lux Mar 31 '19 at 12:13

ajit kumar · Accepted Answer · 2019-05-25 09:55:13Z

simple answer:

let str = 'this is string, length is >26';

console.log([...str]);

answered May 25 '19 at 9:55 by ajit kumar

-1; this adds nothing that wasn't already included in hakatashi's answer. – Mark Amery Jan 17 at 21:17

user2301515 · Accepted Answer · 2020-02-18 18:19:37Z

One possibility is the following:

console.log([1, 2, 3].map(e => Math.random().toString(36).slice(2)).join('').split('').map(e => Math.random() > 0.5 ? e.toUpperCase() : e).join(''));

answered Feb 18 at 18:19 by user2301515

msand · Accepted Answer · 2019-03-31 23:31:40Z

How about this?

function stringToArray(string) {
  let length = string.length;
  let array = new Array(length);
  while (length--) {
    array[length] = string[length];
  }
  return array;
}

answered Mar 31 '19 at 23:31 by msand

@KyleMit this seems faster than for i loop + push jsperf.com/string-to-character-array/3 – msand Mar 31 '19 at 23:33

f3tknco · Accepted Answer · 2020-01-07 01:52:54Z

Array.prototype.slice will do the work as well.

const result = Array.prototype.slice.call("Hello world!");
console.log(result);
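
Like split(''), this treats the string as an array-like sequence of UTF-16 code units. A minimal sketch comparing it with split('') and the spread syntax on an illustrative string containing one emoji:

```javascript
const sample = 'Hi\u{1F496}'; // "Hi💖"

const viaSlice = Array.prototype.slice.call(sample); // code units
const viaSplit = sample.split('');                   // code units
const viaSpread = [...sample];                       // code points

console.log(viaSlice.length);  // 4
console.log(viaSplit.length);  // 4
console.log(viaSpread.length); // 3
console.log(viaSlice.every((unit, i) => unit === viaSplit[i])); // true
```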

answered Jan 7 at 1:52 by f3tknco
