Computer Science/Data Structures

set() & dict() - ์ง‘ํ•ฉ๊ณผ ๋งต

metamong 2023. 8. 3.

๐Ÿ„๐Ÿฝ‍โ™‚๏ธ ๋Œ€ํ‘œ ์ž๋ฃŒ๊ตฌ์กฐ '์ง‘ํ•ฉ'๊ณผ '๋งต'์ด ์žˆ๋‹ค. ๋‘ ๊ฐœ์˜ ์ž๋ฃŒ๊ตฌ์กฐ๋Š” ๋งค์šฐ ํšจ์œจ์ ์ธ ์ž๋ฃŒ๊ตฌ์กฐ๋กœ ํŠน๋ณ„ํžˆ python์—์„œ ์–ด๋–ป๊ฒŒ ์ฝ”๋“œ๋กœ ๊ตฌํ˜„ํ•˜๊ณ , ๊ด€๋ จ ์—ฌ๋Ÿฌ ์—ฐ์‚ฐ ๊ธฐ๋Šฅ ๋ฐ time complexity๊นŒ์ง€ ์ž์„ธํžˆ ์•Œ์•„๋ณด์ž!

 

๐Ÿ„๐Ÿฝ‍โ™‚๏ธ ์‹œ๊ฐ„๋ณต์žก๋„ ๊ณ„์‚ฐ ์ƒ ํšจ์œจ์ ์ธ ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑํ•˜๊ธฐ ์œ„ํ•ด ์ง‘ํ•ฉ๊ณผ ๋งต์ด๋ผ๋Š” ์ž๋ฃŒ๊ตฌ์กฐ๋ฅผ ํ™œ์šฉํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ๋‹ค(ํŠนํžˆ ์–ด๋ ค์šด PS ๋ฌธ์ œ!) ๊ฐ ์ž๋ฃŒ๊ตฌ์กฐ์˜ ํŠน์ง•์„ ์ •ํ™•ํžˆ ์•Œ์•„๋ณด์ž

Sets

overview

"unordered collection of data types that is iterable, mutable and has no duplicate elements. The order of elements in a set is undefined though it may consist of various elements. The major advantage of using a set, as opposed to a list, is that it has a highly optimized method for checking whether a specific element is contained in the set. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference."

 

→ ์ค‘๋ณต์„ ํ—ˆ์šฉํ•˜์ง€ ์•Š๋Š”๋‹ค๋Š” ์ ์ด ์ฃผ๋ชฉํ•  ๋งŒํ•œ ํŠน์ง•!

→ ๋˜ํ•œ unordered collection์œผ๋กœ ์ง‘ํ•ฉ ๋‚ด์— ์›์†Œ์˜ ์ˆœ์„œ๊ฐ€ ์ •ํ•ด์ ธ ์žˆ์ง€ ์•Š์œผ๋ฏ€๋กœ slicing์ด๋‚˜ indexing์ด ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค.

methods & operations

โ‘  set()

: set() ํ•จ์ˆ˜๋กœ ์ง‘ํ•ฉ์„ ๋งŒ๋“ค์ˆ˜ ์žˆ๋‹ค. ์ด ๋•Œ, ์ด๋ฏธ ๋งŒ๋“  ๋ฌธ์ž์—ด, list, dictionary๋ฅผ set() ์ง‘ํ•ฉ์œผ๋กœ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค. (dictionary๋ฅผ set์œผ๋กœ ๋ฐ”๊พธ๋Š” ์ˆœ๊ฐ„ dictionary์˜ key๊ฐ€ ์ค‘๋ณต ์—†์ด ์ง‘ํ•ฉ์œผ๋กœ ์ €์žฅ๋จ. dictionary์˜ key ์ž์ฒด๊ฐ€ ์ค‘๋ณต์„ ํ—ˆ์šฉํ•˜์ง€ ์•Š์œผ๋ฏ€๋กœ dictionary์˜ key ๋ชจ์Œ์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ์Œ)

A = 'hello' #{'l', 'e', 'o', 'h'}
B = [0, 1, 2, 3, 3, 4, 5, 6] #{0, 1, 2, 3, 4, 5, 6}
C = ('a', 'a', 'b', 'c', 'd', 'e', 'f', 'g') #{'d', 'a', 'f', 'c', 'b', 'g', 'e'}
D = {'a': 1, 'b': 2, 'c': 3} #{'a', 'c', 'b'}

print(set(A))
print(set(B))
print(set(C))
print(set(D))

 

โ‘ก add()

: existing set์— ์›์†Œ๋ฅผ ์ถ”๊ฐ€ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ add()๋ผ๊ณ  ํ•œ๋‹ค. ์˜ค๋กœ์ง€ ๋‹จ 1๊ฐœ์˜ ์›์†Œ๋งŒ add() ํ•จ์ˆ˜๋กœ ๊ธฐ์กด ์ง‘ํ•ฉ์— ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, list๋Š” add ๋ถˆ๊ฐ€๋Šฅ. tuple์€ ๊ฐ€๋Šฅํ•˜๋‹ค.

 

โ‘ข update()

: add()์™€ ๋‹ค๋ฅด๊ฒŒ update()๋Š” ๋‘ ๊ฐœ ์ด์ƒ์˜ ์›์†Œ๋ฅผ ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ๋‹ค. list, string, tuple, ๊ทธ๋ฆฌ๊ณ  ๋‹ค๋ฅธ set๊นŒ์ง€ ๋ชจ๋‘ ๊ฐ€๋Šฅํ•˜๋‹ค. ๋‹น์—ฐํžˆ ์ค‘๋ณต ์›์†Œ๋Š” ํ—ˆ์šฉํ•˜์ง€ ์•Š๊ฒŒ set๋ฅผ ์—…๋ฐ์ดํŠธํ•œ๋‹ค.

a = set()

a.add(1)
print(a) #{1}

a.update([2,3])
print(a) #{1,2,3}

a.update({4,5,5,6,6})
print(a) #{1,2,3,4,5,6}

 

โ‘ฃ in

: ์•ž์„œ ์–˜๊ธฐํ–ˆ์ง€๋งŒ, set์€ unordered collection์ด๋ฏ€๋กœ index๊ฐ€ ์กด์žฌํ•˜์ง€ ์•Š์•„ indexing / slicing์œผ๋กœ set ์•ˆ์˜ ์›์†Œ๋ฅผ ๋ถˆ๋Ÿฌ์˜ฌ ์ˆ˜ ์—†๋‹ค. for loop์„ ์ž‘์„ฑํ•˜๊ฑฐ๋‚˜ in์„ ์ด์šฉํ•ด ์ง‘ํ•ฉ์— ์กด์žฌํ•˜๋Š” ์ง€์˜ ์—ฌ๋ถ€๋ฅผ ์•Œ ์ˆ˜ ์žˆ๋‹ค.

a = {1,2,3,4,5,6}

for x in a:
    print(x,end=' ')

print(3 in a)
#1 2 3 4 5 6 True

 

โ‘ค remove() / discard() / pop() / clear()

: remove()์™€ discard()๋Š” ํŠน์ • ์›์†Œ๋ฅผ ์ง‘ํ•ฉ์—์„œ ์ œ๊ฑฐํ•˜๋Š” ํ•จ์ˆ˜์ด๋‹ค. remove()๋Š” ์ œ๊ฑฐํ•˜๋ ค๋Š” ์›์†Œ๊ฐ€ ์—†์œผ๋ฉด KeyError๋ฅผ ๋ฐœ์ƒ์‹œํ‚ค์ง€๋งŒ, discard()๋Š” ์ œ๊ฑฐํ•˜๋ ค๋Š” ์›์†Œ๊ฐ€ ์—†์œผ๋ฉด ๊ทธ๋ƒฅ passํ•œ๋‹ค. pop() ํ•จ์ˆ˜๋Š” ์ง‘ํ•ฉ์˜ ์›์†Œ ์ค‘ ๋žœ๋คํ•˜๊ฒŒ ํ•œ ๊ฐœ์˜ ์›์†Œ๋ฅผ ์„ ํƒํ•ด ์‚ญ์ œํ•˜๊ณ , clear()๋Š” set ์›์†Œ ์ „์ฒด๋ฅผ ์‚ญ์ œํ•˜๋Š” ํ•จ์ˆ˜์ด๋‹ค.

a = {1,2,3,4,5,6}

a.remove(4)
print(a) #{1,2,3,5,6}

a.discard(7)
a.discard(2)
print(a) #{1,3,5,6}

a.pop()
print(a) #{3,5,6}

a.clear()
print(a) #set()

 

โ‘ฅ union(|) / intersection(&) / difference(-) / symmetric difference(^)

: ์™ผ์ชฝ๋ถ€ํ„ฐ ์ฐจ๋ก€๋Œ€๋กœ ํ•ฉ์ง‘ํ•ฉ / ๊ต์ง‘ํ•ฉ / ์ฐจ์ง‘ํ•ฉ / ๋Œ€์นญ ์ฐจ์ง‘ํ•ฉ(์–‘์ชฝ ์ฐจ์ง‘ํ•ฉ ๋ชจ๋‘)

a = {1,2,3,4,5,6}
b = {1,2,3,5,7,8}

print(a.union(b)) #{1, 2, 3, 4, 5, 6, 7, 8}
print(a|b) #{1, 2, 3, 4, 5, 6, 7, 8}

print(a.intersection(b)) #{1, 2, 3, 5}
print(a&b) #{1, 2, 3, 5}

print(a.difference(b)) #{4,6}
print(a-b) #{4,6}

print(a.symmetric_difference(b)) #{4, 6, 7, 8}
print(a^b) #{4, 6, 7, 8}

advantages & disadvantages

→ Advantages:

โ˜… Unique Elements: Sets can only contain unique elements, so they can be useful for removing duplicates from a collection of data.

โ˜… Fast Membership Testing: Sets are optimized for fast membership testing, so they can be useful for determining whether a value is in a collection or not.

โ˜… Mathematical Set Oerations: Sets support mathematical set operations like union, intersection, and difference, which can be useful for working with sets of data.

โ˜… Mutable: Sets are mutable, which means that you can add or remove elements from a set after it has been created.

 

Disadvantages:

โ˜… Unordered: Sets are unordered, which means that you cannot rely on the order of the data in the set. This can make it difficult to access or process data in a specific order.

โ˜… Limited Functionality: Sets have limited functionality compared to lists, as they do not support methods like append() or pop(). This can make it more difficult to modify or manipulate data stored in a set.

โ˜… Memory Usage: Sets can consume more memory than lists, especially for small datasets. This is because each element in a set requires additional memory to store a hash value.

โ˜… Less Commonly Used: Sets are less commonly used than lists and dictionaries in Python, which means that there may be fewer resources or libraries available for working with them. This can make it more difficult to find solutions to problems or to get help with debugging.

time complexity

operation time complexity code
creating a new set O(1) a = set()
creating a set from an iterable O(N) a = set([1,2,3,4,...N])
adding an element O(1) a.add(1)
removing an element O(1) a.remove(1) / a.discard(2) / a.pop()
checking for element membership O(1) 1 in a
clearing the set O(1) a.clear()
union(|) O(len(s) + len(t)) s and t are the two sets being united
intersection(&) O(min(len(s), len(t))
difference(-) O(len(s)) s is the set from which elements are being removed
symmetric difference(^) O(len(s) + len(t)) s and t are the two sets being united

Dictionaries

overview

"dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. You can’t use lists as keys, since lists can be modified in place using index assignments, slice assignments, or methods like append() and extend(). It is best to think of a dictionary as a set of key: value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output. The main operations on a dictionary are storing a value with some key and extracting the value given the key. It is also possible to delete a key:value pair with del. If you store using a key that is already in use, the old value associated with that key is forgotten(change a value). It is an error to extract a value using a non-existent key.

techniques

โ‘  making a dictionary / change, delete, access

#1
tel = {'jack': 4098, 'sape': 4139}

#2
dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])

#3
dict(sape=4139, guido=4127, jack=4098)

#4
dict({1: 'Geeks', 2: 'For', 3: 'Geeks'})

#using tel dictionary
tel['guido'] = 4127 #add
tel['sape'] = 4111 #change
print(tel) #{'jack': 4098, 'sape': 4111, 'guido': 4127}
print(tel['jack']) #4098 - accessing

print('guido' in tel) # True - check key in a dictionary

del tel['sape'] 
print(tel) #{'jack': 4098, 'guido': 4127}

tel.pop('guido')
print(tel) #{'jack': 4098}

tel.popitem()
print(tel) #{} removes the last inserted key-value pair

 

โ‘ก looping, sorted, ๊ทธ ์™ธ ์—ฌ๋Ÿฌ dictionary ํ•จ์ˆ˜๋“ค

→ dictionary๋ฅผ for loop์œผ๋กœ ๋Œ๊ธฐ ์œ„ํ•ด dictionary_name.items()๋ฅผ ๋Œ๋ฆฐ๋‹ค.

for k, v in tel.items():
    print(k,v)

'''
jack 4098
sape 4111
guido 4127
'''

 

keys()์™€ values()๋กœ ๊ฐ๊ฐ key์™€ value list ์ถœ๋ ฅ

print(tel.keys()) #dict_keys(['jack', 'sape', 'guido'])
print(tel.values()) #dict_values([4098, 4111, 4127])

 

→ sort (dictionary sorting ๊ผญ ๊ธฐ์–ตํ•˜๊ธฐ!)

my_dict = {'banana': 3, 'apple': 1, 'orange': 2}

print(sorted(my_dict.items(), key=lambda item: item[1])) #Sorted based on values
#[('apple', 1), ('orange', 2), ('banana', 3)]

print(sorted(my_dict.items(), key=lambda item: item[0])) #Sorted based on keys
#[('apple', 1), ('banana', 3), ('orange', 2)]

print(sorted(my_dict.items(), key=lambda item: item[1], reverse = True)) #Sorted based on values in reverse order
#[('banana', 3), ('orange', 2), ('apple', 1)]

#or
print(sorted(my_dict.items(), key=lambda item: -item[1])) #only applied in numbers
#[('banana', 3), ('orange', 2), ('apple', 1)]

my_dict = {'banana': 1, 'apple': 1, 'orange': 2}
print(sorted(my_dict.items(), key=lambda item: (item[1], item[0]))) #sort based on values then keys
#[('apple', 1), ('banana', 1), ('orange', 2)]

time complexities

โ˜… hash collision์ด ์—†๋‹ค๋Š” ๊ฐ€์ • ํ•˜

+ ์ถ”๊ฐ€) checking a key in a dictionary(in operator) O(1) Time Complexity


* python set documentation 

* python set geeksforgeeks 

 

* python dictionary documentation 

* python dictionary geeksforgeeks 

 

'Computer Science > Data Structures' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

๐ŸฅชArray & Linked List  (2) 2024.01.08
๐ŸŒณ Binary Tree ๐ŸŒณ  (1) 2023.12.19
Binary heap  (0) 2023.02.06
Types of Binary Tree๐ŸŒฒ  (0) 2023.01.10
priority queue(binary heap)  (0) 2023.01.09

๋Œ“๊ธ€