Tuesday, May 21, 2019

Python Sets

Links: Journey to Data Scientist


Characteristics
  • Unordered
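  • Elements must be hashable (the set itself is not hashable)
  • Elements are unique (duplicates are dropped)
  • Mutable (elements can be added or removed; frozenset is the immutable variant)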
  • Support mathematical operations like union, intersection, difference, and symmetric difference
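A short sketch that checks these properties directly. Note that a set itself is mutable, its elements must be hashable, and frozenset is the immutable variant. The variable names are illustrative only.

```python
# Duplicates are dropped: a set keeps only unique elements.
unique = {1, 2, 2, 3, 3, 3}
print(len(unique))  # 3

# A set is mutable: elements can be added after creation.
unique.add(4)
print(4 in unique)  # True

# Elements must be hashable, so a list cannot be a member.
list_error = ""
try:
    bad = {[1, 2], 3}
except TypeError as err:
    list_error = str(err)
    print("TypeError:", list_error)

# frozenset is the immutable (and therefore hashable) variant.
frozen = frozenset(unique)
nested = {frozen}  # allowed, because a frozenset is hashable
print(hasattr(frozen, "add"))  # False
```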

1. Unordered (in practice, the output order follows the elements' hash values, not the order of definition)
Example #1
s = {"abc",123,9.99,"DDf",3.142,100}
print(s)
print(type(s))

Output:
{3.142, 100, 9.99, 'abc', 'DDf', 123}
<class 'set'>

Example #2
s = {"abc","123","9.99","DDf","3.142","100"}
print(s)
print(type(s))

Output:
{'100', 'abc', 'DDf', '3.142', '9.99', '123'}
<class 'set'>

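Examples #1 and #2 above show that the output order differs from the order in the original definition. This is because Python stores set elements in a hash table, so the iteration order depends on each element's hash value rather than on the order in which the elements were written. Different data types use different hash algorithms, and string hashes are randomized per interpreter run by default, so the order of string elements can change from one run to the next.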
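A minimal sketch of how hashing affects what a set holds and how to get a predictable order when one is needed. The claim about integer hashes describes CPython behaviour and is an implementation detail, not a language guarantee.

```python
# In CPython, small non-negative integers hash to themselves.
print(hash(100) == 100)  # True

# Equal values have equal hashes, so 1 and 1.0 count as the same element.
mixed = {1, 1.0}
print(len(mixed))  # 1

# When a stable order is needed, sort explicitly instead of relying on the set.
s = {"abc", "123", "9.99", "DDf", "3.142", "100"}
ordered = sorted(s)
print(ordered)  # ['100', '123', '3.142', '9.99', 'DDf', 'abc']
```

sorted() returns a new list and leaves the set unchanged, so it suits display or comparison code that must not depend on hash order.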


2. Common Operations
s = set() #empty set
print(s)
s = {"abc",123,9.99,"DDf",3.142,100} #pop() removes an arbitrary element, so do not rely on which one goes first
print(s)
print("Length of s: ",len(s)) #number of elements in the set
s.add("one item") #add one element
print("Added 'one item':",s)
s.update(("i1","i2","i3")) #add multiple items
print("Updated 'i1, i2, i3':",s)
s.remove(3.142)
print("Removed 3.142: ",s)
s.discard(99999) #this will not trigger error
print("Discarded 99999 (no effect): ",s)
s.discard("DDf")
print("Discarded DDf: ",s)
s.pop() #removes and returns an arbitrary element, not necessarily the last one defined
print("Popped an arbitrary item: ",s)
s.clear() #make it an empty set
print(s)

s.remove(99999) #raises KeyError because 99999 does not exist in the set (handle it or the script stops here)
del s #removes the name 's'; using 's' afterwards raises NameError

Output (element order can vary between runs):
set()
{3.142, 100, 9.99, 'abc', 'DDf', 123}
Length of s: 6
Added 'one item': {3.142, 100, 'one item', 9.99, 'abc', 'DDf', 123}
Updated 'i1, i2, i3': {3.142, 100, 'one item', 'i2', 9.99, 'abc', 'DDf', 'i1', 123, 'i3'}
Removed 3.142: {100, 'one item', 'i2', 9.99, 'abc', 'DDf', 'i1', 123, 'i3'}
Discarded 99999 (no effect): {100, 'one item', 'i2', 9.99, 'abc', 'DDf', 'i1', 123, 'i3'}
Discarded DDf: {100, 'one item', 'i2', 9.99, 'abc', 'i1', 123, 'i3'}
Popped an arbitrary item: {'one item', 'i2', 9.99, 'abc', 'i1', 123, 'i3'}
set()
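
The difference between remove(), discard(), and pop() on a missing element or an empty set can be sketched as follows. The flag variables are illustrative names, not part of the original example.

```python
s = {"abc", 123}

# remove() raises KeyError when the element is missing.
try:
    s.remove(99999)
    removed = True
except KeyError:
    removed = False
print(removed)  # False

# discard() is the silent alternative: no error, no change.
s.discard(99999)
print(len(s))  # 2

# pop() returns the element it removed; which one is arbitrary.
item = s.pop()
print(item in ("abc", 123))  # True

# pop() on an empty set also raises KeyError.
s.clear()
try:
    s.pop()
    popped_empty = True
except KeyError:
    popped_empty = False
print(popped_empty)  # False
```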


3. Mathematical Operations
s1 = {1,3,4,8,9}
s2 = {5,7,6,8,9}

print("Difference: ",s1.difference(s2))
print("Intersection: ",s1.intersection(s2))
print("IsDisjoint: ",s1.isdisjoint(s2))
print("IsSubSet: ",s1.issubset(s2))
print("IsSuperSet: ",s1.issuperset(s2))
print("Symmetric Difference: ",s1.symmetric_difference(s2))
print("Union: ",s1.union(s2))

Output:
Difference: {1, 3, 4}
Intersection: {8, 9}
IsDisjoint: False
IsSubSet: False
IsSuperSet: False
Symmetric Difference: {1, 3, 4, 5, 6, 7}
Union: {1, 3, 4, 5, 6, 7, 8, 9}
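
Each of these methods also has an operator form. A brief sketch using the same s1 and s2; one practical difference is that the methods accept any iterable, while the operators require both operands to be sets.

```python
s1 = {1, 3, 4, 8, 9}
s2 = {5, 7, 6, 8, 9}

# Operator forms of the methods shown above.
print((s1 - s2) == s1.difference(s2))            # True
print((s1 & s2) == s1.intersection(s2))          # True
print((s1 ^ s2) == s1.symmetric_difference(s2))  # True
print((s1 | s2) == s1.union(s2))                 # True

# <= and >= test subset and superset.
print({8, 9} <= s1)  # True
print(s1 >= {8, 9})  # True

# Methods accept any iterable; operators need a set on both sides.
with_list = s1.union([10, 11])
try:
    s1 | [10, 11]
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)  # False
```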

s1 = {1,3,4,8,9}
s2 = {5,7,6,8,9}

s1.difference_update(s2)
print("After Difference Update: ",s1)

Output:
After Difference Update: {1, 3, 4}

s1 = {1,3,4,8,9}
s2 = {5,7,6,8,9}

s1.intersection_update(s2)
print("After Intersection Update: ",s1)

Output:
After Intersection Update: {8, 9}

s1 = {1,3,4,8,9}
s2 = {5,7,6,8,9}

s1.symmetric_difference_update(s2)
print("After Symmetric Difference Update: ",s1)

Output:
After Symmetric Difference Update: {1, 3, 4, 5, 6, 7}
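
The *_update methods change the set in place and return None, so their result should not be assigned back to the variable. A small sketch that also shows the augmented-assignment operators, which behave the same way; the names a, b, c, and d are illustrative.

```python
s2 = {5, 7, 6, 8, 9}

# The method modifies s1 in place and returns None.
s1 = {1, 3, 4, 8, 9}
result = s1.difference_update(s2)
print(result)           # None
print(s1 == {1, 3, 4})  # True

# Augmented operators are the in-place equivalents.
a = {1, 3, 4, 8, 9}
a -= s2  # same effect as a.difference_update(s2)

b = {1, 3, 4, 8, 9}
b &= s2  # same effect as b.intersection_update(s2)

c = {1, 3, 4, 8, 9}
c ^= s2  # same effect as c.symmetric_difference_update(s2)

d = {1, 3, 4, 8, 9}
d |= s2  # same effect as d.update(s2)
```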
