A substring is a contiguous sequence of characters within a string. Extracting substrings in SPSS is done with CHAR.SUBSTR (SPSS versions 16 and later) or just SUBSTR (SPSS versions 15 and earlier). CHAR.SUBSTR takes two or three arguments, as shown by the minimal example below.
SPSS CHAR.SUBSTR – Minimal Example
COMPUTE var_2 = CHAR.SUBSTR(var_1,3,2).
The three arguments mean the following:
- var_1 denotes the variable from which the substring is taken;
- 3 is the position of the first character that’s extracted;
- 2 is the number of characters to extract.
Altogether, this first example implies that var_2 will consist of characters 3 and 4 of var_1.
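To make the argument semantics concrete, here is a minimal Python sketch of the same behavior. The helper name char_substr is hypothetical and not part of SPSS or Python; it only assumes a 1-based starting position and an optional length, as described above, and ignores SPSS specifics such as fixed-width string variables.

```python
def char_substr(text, start, length=None):
    """Hypothetical helper mimicking CHAR.SUBSTR: 1-based start, optional length."""
    begin = start - 1  # convert the 1-based position to Python's 0-based index
    if length is None:
        # No length given: take everything from the starting position onward.
        return text[begin:]
    return text[begin:begin + length]


print(char_substr("abcdef", 3, 2))  # cd
print(char_substr("abcdef", 3))     # cdef
```

With start 3 and length 2, the result holds characters 3 and 4 of the input, which matches the reading of the minimal example.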
SPSS Substring Syntax Examples
The examples below use webdesigners.sav.
get file 'webdesigners.sav'.
string fname lname company tld (a30).
* First character of email.
compute fname = char.substr(email,1,1).
execute.
* Characters 3 and 4 of email.
compute fname = char.substr(email,3,2).
execute.
* Everything before the first period.
compute fname = char.substr(email,1,char.index(email,'.') - 1).
execute.
* Capitalize the first letter of fname.
compute fname = concat(upper(char.substr(fname,1,1)),char.substr(fname,2)).
execute.
* Everything between the first period and the at sign.
compute lname = char.substr(email,char.index(email,'.') + 1,char.index(email,'@') - 1 - char.index(email,'.')).
execute.
* Capitalize the first letter of lname.
compute lname = concat(upper(char.substr(lname,1,1)),char.substr(lname,2)).
execute.
* Everything after the at sign.
compute company = char.substr(email,char.index(email,'@') + 1).
execute.
* Everything after the last period in company.
compute tld = char.substr(company,char.rindex(company,'.') + 1).
execute.
document 'bal'.
document 'bol'.
display documents.
Explanation
- In SPSS, a substring can be extracted by using CHAR.SUBSTR(a,b,c).
- Here, a refers to the string from which the substring should be taken.
- The second argument b indicates the starting position (“start at the bth character”).
- The third argument c is the length of the substring. It may be omitted, in which case all characters from the starting position onward will be extracted.
- As seen in the examples that use CHAR.INDEX, b and c don’t have to be static numbers. They may be replaced by (for example) the position of the last space in a string, which is returned by RINDEX.
- The CHAR prefix may often be omitted. Exactly when is explained in Unicode mode.
- Just SUBSTRING can be used for modifying the original string in many cases. This is shown in the final example.
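For comparison, the extraction logic of the syntax above can be sketched in Python. The function name split_email and the sample address are assumptions made for illustration; the sketch presumes addresses of the form first.last@company.tld, which is what the SPSS examples appear to rely on.

```python
def split_email(email):
    """Hypothetical helper: split 'first.last@company.tld' into its parts."""
    dot = email.find(".")  # first period, comparable to CHAR.INDEX(email,'.')
    at = email.find("@")   # at sign, comparable to CHAR.INDEX(email,'@')

    fname = email[:dot]                     # everything before the first period
    lname = email[dot + 1:at]               # between the first period and the at sign
    company = email[at + 1:]                # everything after the at sign
    tld = company[company.rfind(".") + 1:]  # everything after the last period

    # Uppercase only the first letter, as the CONCAT/UPPER lines do.
    fname = fname[:1].upper() + fname[1:]
    lname = lname[:1].upper() + lname[1:]
    return fname, lname, company, tld


print(split_email("john.smith@example.com"))
# ('John', 'Smith', 'example.com', 'com')
```

Note that Python's find and rfind return 0-based positions, whereas CHAR.INDEX and CHAR.RINDEX return 1-based positions, so the offsets differ by one from the SPSS versions.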
Python Substring Examples
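begin program.
pets = 'Cat Dog Rat'
# Characters at indices 4 through 6: 'Dog'.
print(pets[4:7])
# Everything after the last space: 'Rat'.
print(pets[pets.rfind(" ") + 1:])
end program.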
Explanation
- In Python, a substring can be extracted from a string by using square brackets []. These enclose the index or indices of the character(s) to be extracted.
- Extracting a range this way is called slicing. (Square brackets work on more than just strings. For instance, mylist[1] would return the second element from a list called “mylist”.)
- A range of characters is specified by a colon :.
- For example, [1:4] returns the second through the fourth elements. This is because indices start at 0 and the range runs from the start index up to (the end index - 1).
- In a similar vein, if the start index is omitted (as in [:4]), the first through the fourth elements are returned.
- Finally, if the end index is omitted ([1:]), the second through the final elements are returned.
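The slicing rules above can be checked with a short snippet. The variable name letters and its value are chosen purely for illustration.

```python
letters = "abcdef"

print(letters[1:4])  # bcd   (second through fourth characters)
print(letters[:4])   # abcd  (first through fourth characters)
print(letters[1:])   # bcdef (second through final characters)
print(letters[1])    # b     (a single index returns one character)
```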
