2021

Aug 05

2021-08-05 ·

python

Python Multiprocessing

2021

Jun 30

2021-06-30 ·

python

Jupyter with multiple environments

conda activate base
conda install -c conda-forge nb_conda
jupyter notebook
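To confirm which environment a notebook kernel is actually running in, a quick check in a cell can help. This is a minimal sketch using only the standard library; `CONDA_DEFAULT_ENV` is normally set by `conda activate`, but it may be missing or reflect the environment Jupyter was launched from rather than the kernel's.

```python
import os
import sys

# Path of the Python interpreter backing the current kernel
print(sys.executable)
# Root directory of the environment that interpreter belongs to
print(sys.prefix)
# Name of the activated conda environment, if conda exported it
print(os.environ.get("CONDA_DEFAULT_ENV", "<not set>"))
```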

2021

Jan 20

2021-01-20 ·

python

Compile PyTorch with cuda/10.0 on HiPerGator

srun -p gpu --ntasks=1 --cpus-per-task=10 --gpus=geforce:1 --time=02:00:00 --mem=50gb  --pty -u bash -i

Prepare environment

conda activate base
conda install cudatoolkit=10.0
conda install -c pytorch magma-cuda100  # also install magma in base

conda create -n cuaev python=3.7
conda activate cuaev
conda install cudatoolkit=10.0  # yeah do it again
conda install -c pytorch magma-cuda100
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
module load cuda/10.0.130
module load gcc/7.3.0
export CUDA_HOME=/apps/compilers/cuda/10.0.130

Install PyTorch (from pytorch/pytorch)

conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install  # takes about half an hour

Check PyTorch

python -c 'from torch.utils.collect_env import get_pretty_env_info; print(get_pretty_env_info())'

2020

Nov 17

2020-11-17 ·

python

PyTorch

Basic

Quantization

2020

Nov 13

2020-11-13 ·

python

Python String Alignment

str.ljust() 
str.rjust() 
str.center()

header_lens = {'function': 10, 'time': 10, 'count': 10}
timers = {'run': 10, 'forward': 8, 'backward': 4}
counts = {'run': 1, 'forward': 1, 'backward': 1}
for h, width in header_lens.items():
    print(h.rjust(width), end="")
print()
for k in timers:
    print(k.rjust(header_lens['function']), end="")
    print(str(timers[k]).rjust(header_lens['time']), end="")
    print(str(counts[k]).rjust(header_lens['count']), end="")
    print()

Output

  function      time     count
       run        10         1
   forward         8         1
  backward         4         1
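The same table can be produced with format specs instead of `str.rjust()`. The following is a sketch, not the original approach; `format_row` is a hypothetical helper introduced here for illustration.

```python
header_lens = {'function': 10, 'time': 10, 'count': 10}
timers = {'run': 10, 'forward': 8, 'backward': 4}
counts = {'run': 1, 'forward': 1, 'backward': 1}

def format_row(values, widths):
    # '>' right-aligns each value inside its column width,
    # which matches what str.rjust(width) does
    return "".join(f"{str(v):>{w}}" for v, w in zip(values, widths))

widths = list(header_lens.values())
lines = [format_row(header_lens.keys(), widths)]
for name in timers:
    lines.append(format_row((name, timers[name], counts[name]), widths))
print("\n".join(lines))
```

Use `<` or `^` in place of `>` for the `ljust()` and `center()` equivalents.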

2020

Jul 18

2020-07-18 ·

python

Python Argument Passing

Mutable and immutable objects

a = [1, 2, 3]
def mutate(a):
    a += [4]
print(a)  # [1, 2, 3]
mutate(a)
print(a)  # [1, 2, 3, 4]

b = 1
def mutate(b):
    b += 1
print(b)  # 1
mutate(b)
print(b)  # 1
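One way to see why the two cases differ is to compare object identities before and after `+=`. This is a small sketch; the function names are made up for illustration.

```python
def extend_in_place(items):
    before = id(items)
    items += [4]          # list.__iadd__ mutates the object the caller also holds
    return id(items) == before

def increment(n):
    before = id(n)
    n += 1                # int is immutable: the local name is rebound to a new object
    return id(n) == before

a = [1, 2, 3]
same_list = extend_in_place(a)
b = 1
same_int = increment(b)
print(same_list, a)  # True [1, 2, 3, 4]
print(same_int, b)   # False 1
```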

Functions are objects too; default arguments are initialized only once

def test(b=[]):
    b += [1]
    print(b)

test()  # [1]
test()  # [1, 1]
test()  # [1, 1, 1]

def test(b=None):
    if b is None:  # `b or []` would also discard an empty list passed by the caller
        b = []
    b += [1]
    print(b)

test()  # [1]
test()  # [1]
test()  # [1]
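Because the function is an object, the shared default can be inspected directly through its `__defaults__` attribute. A minimal sketch (the name `append_one` is chosen here for illustration):

```python
def append_one(b=[]):
    b += [1]
    return b

initial = append_one.__defaults__ == ([],)
print(append_one.__defaults__)  # ([],)
append_one()
append_one()
# The same list object stored on the function has grown across calls
print(append_one.__defaults__)  # ([1, 1],)
```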

Default values are evaluated when the function is defined, not when it is called

i = 1
def test(a=i):
    print(a)

i = 2
test()  # 1
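When a default should instead be computed on every call, a common pattern is a `None` sentinel. The sketch below uses a counter only to make the evaluation order visible; `eager` and `lazy` are illustrative names.

```python
import itertools

counter = itertools.count()

def eager(stamp=next(counter)):
    # next(counter) ran exactly once, when this def statement executed
    return stamp

def lazy(stamp=None):
    # None is a sentinel; the real default is computed at call time
    if stamp is None:
        stamp = next(counter)
    return stamp

eager_results = [eager(), eager()]
lazy_results = [lazy(), lazy()]
print(eager_results)  # [0, 0]
print(lazy_results)   # [1, 2]
```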

2020

Jul 17

2020-07-17 ·

python

Multiprocessing

Contexts and start methods

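fork: the child process is a clone of the parent, so it can typically access the dataset and the Python functions passed as arguments directly through the copied address space. Available on Unix only.
spawn: starts a fresh Python interpreter; the target function and its arguments must be pickled and sent to the child, so startup is slower.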
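Since spawn has to pickle whatever is handed to the child, a quick way to anticipate problems is to test picklability up front. This is a sketch under that assumption; `is_picklable` and `worker` are hypothetical names, and no process is actually started here.

```python
import pickle

def worker(rank):
    # Module-level functions are pickled by reference (module + name)
    return rank * 2

def is_picklable(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

print(is_picklable((0, {"lr": 0.1})))   # plain data such as args tuples
print(is_picklable(lambda r: r * 2))    # lambdas cannot be pickled by the stdlib pickler
```

With fork this check is usually unnecessary, because the child inherits the objects instead of receiving pickled copies.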


Wrap a function with several inputs into a single-argument callable

import functools
a = functools.partial(f, n=20)  # f is any callable taking n; a(x) is equivalent to f(x, n=20)

Reference: Python multiprocessing a function with several inputs - Stack Overflow
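A self-contained sketch of the idea, using the built-in `map` so it runs without starting any processes; `power` is a made-up example function.

```python
import functools

def power(x, n):
    return x ** n

# Fix n=2; the resulting callable takes only x
square = functools.partial(power, n=2)
results = list(map(square, [1, 2, 3]))
print(results)  # [1, 4, 9]
```

A `multiprocessing.Pool.map` call takes the same kind of single-argument callable, and a partial of a module-level function can be pickled, so `square` should be usable there as well.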

Example

processes = []
torch.multiprocessing.set_start_method('fork')
for rank in range(self.world_size):
    p = torch.multiprocessing.Process(target=worker, args=(rank, self.config))
    p.start()
    processes.append(p)

for p in processes:
    p.join()

2020

Mar 14

2020-03-14 ·

python

Python tips and tricks

List Comprehensions

# [ expression for item in list if conditional ]
def square(x):
    return x**2
a = [square(i) for i in range(10) if (i % 2 == 0)]
# [0, 4, 16, 36, 64]

if else

>>> a = [10, -1, 1, -10, None, 0, None, 0]
>>> [x if x > 0 else -x for x in a if x]
[10, 1, 1, 10]

dataclass (new in Python 3.7)

from dataclasses import dataclass
@dataclass
class Card:
    rank: str
    suit: str
card = Card("Q", "hearts")
print(card == card)
# True
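Comparing a card with itself is always true, so a slightly more telling sketch compares two separate instances and shows the generated `__repr__`:

```python
from dataclasses import dataclass

@dataclass
class Card:
    rank: str
    suit: str

first = Card("Q", "hearts")
second = Card("Q", "hearts")
print(first)            # Card(rank='Q', suit='hearts')
print(first == second)  # True: the generated __eq__ compares fields
print(first is second)  # False: still two distinct objects
```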

Slicing a list

a[start:stop:step]; for a positive step the defaults are a[0:len(a):1] (a[0:-1:1] would drop the last element)

"abcdefgh"[::2]
# 'aceg'
"abcd"[::-1]
# 'dcba'
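A few more cases, sketched on a small list, to make the defaults and the effect of a negative `stop` concrete:

```python
a = list(range(6))              # [0, 1, 2, 3, 4, 5]
print(a[:] == a[0:len(a):1])    # True: omitted bounds cover the whole list
print(a[0:-1])                  # [0, 1, 2, 3, 4], stop=-1 excludes the last item
print(a[1:5:2])                 # [1, 3]
print(a[::-1])                  # [5, 4, 3, 2, 1, 0]
```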

map

def upper(s):
    return s.upper()
mylist = list(map(upper, ['sentence', 'fragment']))
# ['SENTENCE', 'FRAGMENT']

Ternary Operator For Conditional Assignment

x = "Success!" if (y == 2) else "Failed!"

Integer division

# Python 3
5 / 2   # 2.5
5 // 2  # 2
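Note that `//` is floor division, which matters for negative operands. A short sketch:

```python
print(-5 // 2)        # -3: rounds toward negative infinity
print(int(-5 / 2))    # -2: int() truncates toward zero
print(divmod(5, 2))   # (2, 1): quotient and remainder together
print(divmod(-5, 2))  # (-3, 1)
```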

Avoid using np.append on a big array inside a for loop (each call copies the whole array)

In [1]: import numpy as np

In [2]: %%timeit
   ...: a = np.array(1)
   ...: for i in range(5000):
   ...:     a = np.append(a, 1)
   ...:
31.8 ms ± 615 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %%timeit
   ...: a = [1]
   ...: for i in range(5000):
   ...:     a.append(1)
   ...:
380 µs ± 9.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Reference:

30 Python Best Practices, Tips, And Tricks - Towards Data Science

2020

Feb 03

2020-02-03 ·

python

Python notes

copy.copy() and copy.deepcopy()

>>> import copy
>>> a = [1, 2, [3]]
>>> c = copy.deepcopy(a)
>>> b = copy.copy(a)
>>> a[-1][0] = 4
>>> a
[1, 2, [4]]
>>> b
[1, 2, [4]]
>>> c
[1, 2, [3]]
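The difference comes down to whether the nested list is shared. A sketch that checks identity directly:

```python
import copy

a = [1, 2, [3]]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

print(shallow is a)          # False: the outer list is a new object
print(shallow[-1] is a[-1])  # True: the inner list is shared
print(deep[-1] is a[-1])     # False: the inner list was copied too
```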

is and ==

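a = [1, 2]
b = [1, 2]
a == b  # True  | same value
a is b  # False | different objects (small ints such as 1 are cached in CPython, so `is` would be True for them)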

Sort

intervals = [[1, 2], [2, 3], [5, 8], [0, 1]]
intervals.sort(key=lambda x: x[0])

from functools import cmp_to_key
def cmp_xy(x, y):
    spe_dict = {'C':1, 'H':0, 'N':1, 'O':2}
    x = spe_dict[x]
    y = spe_dict[y]

    if x > y:
        return 1
    elif x < y:
        return -1
    else:
        return 0

a = ['C', 'H', 'N', 'O']
a.sort(key=cmp_to_key(cmp_xy))
print(a)  # ['H', 'C', 'N', 'O']

Ref: Reference

a = ['C', 'H', 'N', 'O']
spe_dict = {'C':1, 'H':0, 'N':1, 'O':2}
a.sort(key=lambda x: spe_dict[x])
print(a) # ['H', 'C', 'N', 'O']
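`'C'` and `'N'` share the same rank, so their relative order depends on the input order because Python's sort is stable. A tuple key can break such ties explicitly; this is a sketch with a differently ordered input to make the effect visible.

```python
spe_dict = {'C': 1, 'H': 0, 'N': 1, 'O': 2}
a = ['O', 'N', 'H', 'C']

# Stable sort: 'N' stays ahead of 'C' because it came first in the input
by_rank = sorted(a, key=lambda x: spe_dict[x])
# Tuple key: sort by rank first, then alphabetically within equal ranks
by_rank_then_name = sorted(a, key=lambda x: (spe_dict[x], x))

print(by_rank)            # ['H', 'N', 'C', 'O']
print(by_rank_then_name)  # ['H', 'C', 'N', 'O']
```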

d = {'a': 1, 'b': 2}  # avoid naming a variable `dict`, which shadows the built-in
d.values()
d.keys()
d.items()

# key with the largest value
max(d, key=d.get)
max(d.keys(), key=lambda k: d[k])
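A concrete sketch of the same lookups on sample data; the dictionary name `scores` and its contents are made up for illustration.

```python
scores = {'run': 10, 'forward': 8, 'backward': 4}

best_key = max(scores, key=scores.get)
# Iterating over items returns the key and its value together
best_item = max(scores.items(), key=lambda kv: kv[1])
worst_key = min(scores, key=scores.get)

print(best_key)   # run
print(best_item)  # ('run', 10)
print(worst_key)  # backward
```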

2019

Dec 08

2019-12-08 ·

python

Python Internal

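Books: Inside The Python Virtual Machine (pdf | read online), Python源码剖析 (Python Source Code Analysis) (pdf)
How Python programs run (谈谈 Python 程序的运行原理) | 淡水网志
- Python's garbage collection mechanism (Python垃圾回收机制) - hbprotoss's blog
- Python memory pool management and buffer pool design (Python内存池管理与缓冲池设计) - 张知临's column
CPython's Global Interpreter Lock (GIL): prevents multiple threads from executing Python bytecode at once.

"Have you heard the complaints about CPython's GIL? Quite often, right? How many ordinary Python users know that complaining about the GIL really means complaining about CPython's reference counting and its C API implementation?" -- Zhihu
Python's Global Interpreter Lock (GIL) (Python的全局解释器锁) - Jianshu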
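The reference counting mentioned in these links can be observed from Python with `sys.getrefcount`. This is a rough sketch that assumes CPython; the absolute numbers are an implementation detail, so only the difference is printed.

```python
import sys

x = []
before = sys.getrefcount(x)
y = x                          # bind a second name to the same list
after = sys.getrefcount(x)
print(after - before)          # one more reference on CPython

del y                          # drop that reference again
restored = sys.getrefcount(x) == before
print(restored)
```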

Done is better than perfect

2019

Aug 14

2019-08-14 ·

python

Python String format

"{:,}".format(123456789)                                            
'123,456,789'

'CPU  {: 4d}%'.format(100)
'CPU  {: 4d}%'.format(10)
'CPU  {: 4d}%'.format(0)
CPU   100%
CPU    10%
CPU     0%

'CPU  {:04d}%'.format(100)
'CPU  {:04d}%'.format(10)
'CPU  {:04d}%'.format(0)
CPU  0100%
CPU  0010%
CPU  0000%

To include literal braces in a format string, double them: {{ and }}

"dict = {{'a': {}}}".format(1)
"dict = {'a': 1}"

Ref: str.format()
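The same format specs work inside f-strings (Python 3.6+). A sketch that mirrors the examples above; the variable names are chosen here for illustration.

```python
value = 123456789
with_commas = f"{value:,}"
space_padded = [f"CPU  {load: 4d}%" for load in (100, 10, 0)]
zero_padded = [f"CPU  {load:04d}%" for load in (100, 10, 0)]
# Doubled braces are literal braces in f-strings as well
literal_braces = f"dict = {{'a': {1}}}"

print(with_commas)     # 123,456,789
print(space_padded)
print(zero_padded)
print(literal_braces)  # dict = {'a': 1}
```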