Split When Character Changes In Python

I have the string 6#666#665533999 and I want to split it into smaller strings wherever the character changes, ignoring the #, so that I can substitute 6 = M or 666

Solution 1:

Use itertools.groupby:

>>> from itertools import groupby
>>> c = "6#666#665533999"
>>> ["".join(g) for k, g in groupby(c) if k != '#']
['6', '666', '66', '55', '33', '999']

Then use a dictionary that maps each of these groups to a character on the dial pad.

cmap = {'77': 'Q', '9999': 'Z'} # And so forth..
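
Rather than typing every entry by hand, the dictionary can be generated. The following is a minimal sketch, assuming the standard phone keypad layout; the names KEYPAD and decode_with_cmap are made up for illustration and are not part of the original answer.

```python
from itertools import groupby

# Assumed standard keypad layout: each digit and the letters printed on its key.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Build the lookup: '6' -> 'M', '66' -> 'N', '666' -> 'O', and so on.
cmap = {
    digit * presses: letter
    for digit, letters in KEYPAD.items()
    for presses, letter in enumerate(letters, start=1)
}


def decode_with_cmap(digits):
    """Split into runs of identical digits, drop '#', and look each run up."""
    groups = ["".join(g) for k, g in groupby(digits) if k != "#"]
    return "".join(cmap[group] for group in groups)


print(decode_with_cmap("6#666#665533999"))  # MONKEY
```

A run that is longer than the number of letters on its key (for example '2222') is not in cmap, so this sketch raises KeyError for it.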

Solution 2:

This is basically the same logic, just wrapped up a bit more tidily:

from itertools import groupby

phone_chars = {
    "2": " ABC",
    "3": " DEF",
    "4": " GHI",
    "5": " JKL",
    "6": " MNO",
    "7": " PQRS",
    "8": " TUV",
    "9": " WXYZ",
    "#": ["", ""]
}

def decode(digit_str):
    # The run length of each character is the index into its phone_chars entry.
    return "".join(phone_chars[ch][len(list(it))] for ch, it in groupby(digit_str))

then

>>> print(decode("6#666#665533999"))
MONKEY
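
The one-liner above fails with an IndexError or KeyError on input it cannot decode, such as a key pressed more times than it has letters. The following is a possible variant that reports such input explicitly. It is only a sketch: the name decode_checked and the choice to raise ValueError are assumptions, not part of the original answer.

```python
from itertools import groupby

# Same table as in the answer; index 0 is padding so that
# "pressed n times" can be used directly as the index.
PHONE_CHARS = {
    "2": " ABC", "3": " DEF", "4": " GHI", "5": " JKL",
    "6": " MNO", "7": " PQRS", "8": " TUV", "9": " WXYZ",
}


def decode_checked(digit_str):
    """Decode keypad presses, raising ValueError on runs that map to no letter."""
    letters = []
    for ch, run in groupby(digit_str):
        presses = len(list(run))
        if ch == "#":
            continue  # separator: contributes no letter
        options = PHONE_CHARS.get(ch)
        if options is None or presses >= len(options):
            raise ValueError(f"cannot decode {ch!r} pressed {presses} time(s)")
        letters.append(options[presses])
    return "".join(letters)


print(decode_checked("6#666#665533999"))  # MONKEY
```

Here any number of consecutive # characters is treated as a single separator, which is a design choice of this sketch.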

Edit:

Python has very good online documentation.

itertools.groupby accepts an input sequence and an optional key parameter, which is an evaluation function: it accepts a value, does something to it, and returns the result. If the key function is not given, it defaults to identity (i.e. key = lambda x: x, take a value and return the same value).

It then applies this evaluation function to each item in the input sequence and returns a sequence of (evaluated_value, iter(consecutive_items_having_the_same_evaluated_value)).

So

groupby("AABBBCCCDD")

gets you

iter((
    ("A", iter(("A", "A"))),
    ("B", iter(("B", "B", "B"))),
    ("C", iter(("C", "C", "C"))),
    ("D", iter(("D", "D")))
))

or (demonstrating a custom evaluation function)

groupby([1, 2, 3, 4, 5, 6, 7], key=lambda x: x//3)

gives

iter((
    (0, iter([1, 2])),      # 1//3 == 2//3 == 0
    (1, iter([3, 4, 5])),   # 3//3 == 4//3 == 5//3 == 1
    (2, iter([6, 7]))       # 6//3 == 7//3 == 2
))

Note that iterators only give you one value at a time, and len won't work on one because there is no way to know how many values the iterator might return. If you need to count the returned values, the simplest way is len(list(iterable)) - grab all returned values into a list and then see how many items the list has. So if we do

[(ch, len(list(it))) for ch,it in groupby("6#666#665533999")]

what we get back is

[
    ('6', 1),   # => 'M'
    ('#', 1),   # => ''
    ('6', 3),   # => 'O'
    ('#', 1),   # => ''
    ('6', 2),   # => 'N'
    ('5', 2),   # => 'K'
    ('3', 2),   # => 'E'
    ('9', 3)    # => 'Y'
]

which (by design) happens to be exactly the required index values into phone_chars. We use the index values to get the corresponding characters - ie phone_chars['6'][1] == 'M' - join them together using "".join(), and return the resulting string ("MONKEY").
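
The remark about len and iterators can be checked directly. The sketch below also shows that counting consumes the iterator, and includes a counting helper that avoids building a list; the name count_items is made up for this example.

```python
from itertools import groupby

groups = groupby("666#")
key, run = next(groups)           # key == '6', run iterates over the three 6s

try:
    len(run)
    len_worked = True
except TypeError:
    len_worked = False            # iterators do not support len()

first_count = len(list(run))      # consumes the iterator: 3
second_count = len(list(run))     # already exhausted: 0

print(key, len_worked, first_count, second_count)  # 6 False 3 0


def count_items(iterator):
    """Count the items of an iterator without storing them."""
    return sum(1 for _ in iterator)


run_lengths = [(ch, count_items(it)) for ch, it in groupby("6#666#66")]
print(run_lengths)
# [('6', 1), ('#', 1), ('6', 3), ('#', 1), ('6', 2)]
```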

Solution 3:

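import itertools

def encode(nums):
    # Split into runs of identical characters, dropping the '#' separators.
    return ["".join(g) for k, g in itertools.groupby(nums) if k != '#']


# digit -> {number of presses -> letter}, e.g. chars['6'] == {1: 'M', 2: 'N', 3: 'O'}
chars = {str(n): dict(enumerate(chrs, 1)) for n, chrs in enumerate("ABC DEF GHI JKL MNO PQRS TUV WXYZ".split(), 2)}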

nums = "6#666#665533999"
nums = encode(nums)
message = ''.join(chars[n[0]][len(n)] for n in nums)

In [28]: message
Out[28]: 'MONKEY'
