Posted 2022-01-24 · Updated 2022-02-01 · boostcamp / Peer Session · 1 min read (about 147 words)

Week2 Peer Session

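1. When the tensor is a vector, why does backward have to be called as shown below?

import torch
a = torch.tensor([3., 2.], requires_grad=True)
b = torch.tensor([4., 1.], requires_grad=True)
Q = a*2 + b**2
Q.backward(gradient=torch.tensor([1., 1.]))

PyTorch can only start backward implicitly from a scalar. For a non-scalar tensor such as Q, backward computes a vector-Jacobian product, so a gradient tensor with the same shape as Q has to be passed explicitly.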
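
As a small check of the point above, the sketch below compares passing a gradient of ones with reducing Q to a scalar first. The variable names grad_a, grad_b, and Q2 are only for illustration.

```python
import torch

a = torch.tensor([3.0, 2.0], requires_grad=True)
b = torch.tensor([4.0, 1.0], requires_grad=True)

# Non-scalar output: pass a gradient with the same shape as Q
Q = a * 2 + b ** 2
Q.backward(gradient=torch.ones_like(Q))
grad_a, grad_b = a.grad.clone(), b.grad.clone()

# Reset, then reduce to a scalar before calling backward
a.grad = None
b.grad = None
Q2 = a * 2 + b ** 2
Q2.sum().backward()

print(grad_a, a.grad)  # both are dQ/da = 2 for every element
print(grad_b, b.grad)  # both are dQ/db = 2 * b
```

Passing a vector of ones gives the same result as calling sum() first, because both weight every element of Q equally.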

2. optimizer.zero_grad()

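Resets the gradients of every parameter managed by the optimizer.
Without this reset, each backward call adds its result onto the existing grad, so the accumulated gradients no longer match the current batch and training goes wrong.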
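
The accumulation can be seen with a single parameter. This is a minimal sketch; the parameter w and the learning rate are arbitrary choices for illustration.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

(w * 3).sum().backward()
first = w.grad.clone()        # gradient of 3*w is 3

(w * 3).sum().backward()      # no reset: the new gradient is added on top
accumulated = w.grad.clone()

optimizer.zero_grad()         # clear the stored gradient
(w * 3).sum().backward()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)
```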

3. GIL

Global Interpreter Lock

Posted 2022-01-24 · Updated 2022-03-09 · boostcamp / week · 9 min read (about 1366 words)

Boostcamp AI Tech Week 2, Day 2: PyTorch (3)

3. torch.nn

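The PyTorch module that holds neural-network functionality.
It contains the Layers and Functions used to build neural networks.
- Layer : a single layer of a neural network. A class that applies a linear or non-linear operation to its input and returns the output
- Function : the functions needed for neural-network computation, such as activation functions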

3.1 nn.Module

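The base class for building custom networks (models).
An nn.Module can contain other Modules.
- Because modules are stacked on top of each other, they are also called Layers
- Layers together make up a Model
A model can be built from the skeleton below.
- super : initializes the attributes inherited from nn.Module. Without this call the class is an empty shell that cannot register parameters or submodules
- forward : the function that implements the forward pass
  import torch.nn as nn

  class TestNet(nn.Module):
      def __init__(self):
          super().__init__()  # sets up nn.Module's internal state

      def forward(self, x):
          x_out = x  # placeholder: replace with the real computation
          return x_out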

3.2 Container

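Storage for grouping Layers together
Containers - PyTorch official documentation

nn.Sequential()
- A Container that bundles several modules so they can be used as a single module
- Keeps a sequence of Layers tidy by managing them as one unit
nn.ModuleList()
- A Container that holds several modules in one place like a list
- Modules can be retrieved by indexing
- Unlike a plain Python list attribute, its contents are registered with the parent module, so they appear when the class is printed
nn.ModuleDict()
- A Container that holds several modules in one place like a dict
- Modules can be retrieved by key
- Like ModuleList(), its contents appear when the class is printed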
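
The three containers can be compared side by side. This is a sketch with arbitrary layer sizes; the classes PlainList and Registered are hypothetical and only exist to show the difference between a Python list and nn.ModuleList.

```python
import torch
import torch.nn as nn

seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(3)])
acts = nn.ModuleDict({"relu": nn.ReLU(), "tanh": nn.Tanh()})

x = torch.randn(5, 4)
out_seq = seq(x)             # runs the three modules in order
out_list = layers[1](x)      # pick one module by index
out_dict = acts["tanh"](x)   # pick one module by key

class PlainList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(4, 4)]                 # plain list: not registered

class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(4, 4)])  # registered submodule

n_plain = len(list(PlainList().parameters()))
n_registered = len(list(Registered().parameters()))
print(n_plain, n_registered)  # the plain list hides its weight and bias
```

A module kept in a plain list is also invisible to the optimizer, which is usually the practical reason to prefer nn.ModuleList.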

3.3 Parameter & Buffer

Parameter
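- A special Tensor stored inside a module
- Unlike an ordinary Tensor attribute, it has gradients computed for it and is saved together with the model
- Used when a weight has to be reused and updated, as in an RNN
- The weight tensors of built-in submodules (e.g. nn.Linear) are already Parameters
- Declared with nn.Parameter()
Buffer
- A Tensor stored inside a module that is not updated by the optimizer
- Saved together with the model
- Used for storing things such as config-like information
- Registered with nn.Module's register_buffer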
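
The difference between a Parameter, a buffer, and a plain tensor attribute shows up in what the module reports. The Scale class and its attribute names below are hypothetical, chosen only for this sketch.

```python
import torch
import torch.nn as nn

class Scale(nn.Module):  # hypothetical module for illustration
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3))         # trainable, saved
        self.register_buffer("offset", torch.zeros(3))    # not trainable, saved
        self.plain = torch.zeros(3)                       # neither tracked nor saved

    def forward(self, x):
        return x * self.weight + self.offset

m = Scale()
param_names = [name for name, _ in m.named_parameters()]
buffer_names = [name for name, _ in m.named_buffers()]
state_keys = set(m.state_dict().keys())
print(param_names, buffer_names, state_keys)
```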

3.4 Module 내부 살펴보기

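nn.Module provides ways to inspect its internal attributes.
Submodules, Parameters, buffers, and other attributes are stored in OrderedDict-style dictionaries and can be retrieved from them.

submodule
- Submodules (modules inside a module) can be inspected with the functions below
- named_children
  - shows only the submodules directly below the module
- named_modules
  - shows the module itself and every module nested inside it, at any depth
parameter
- named_parameters returns the parameters together with their names
buffer
- named_buffers returns the buffers together with their names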
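
The difference between named_children and named_modules is easiest to see on a nested model. The model below is an arbitrary example built only for this sketch.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Sequential(nn.Linear(2, 2), nn.ReLU()),
    nn.Linear(2, 1),
)

children = [name for name, _ in model.named_children()]
modules = [name for name, _ in model.named_modules()]
params = [name for name, _ in model.named_parameters()]

print(children)  # direct submodules only
print(modules)   # the root (empty name) plus every nested module
print(params)    # dotted names follow the module hierarchy
```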

3.5 hook

An interface that lets custom code run in the middle of packaged code.
PyTorch has two kinds of hook, depending on what they are registered on:
- Tensor hooks, registered on a Tensor
- Module hooks, registered on a Module
There are five kinds of hook, depending on when they run:
- forward pre hooks : run before the forward computation
- forward hooks : run after the forward computation
- backward_hooks : run each time backward is performed. Their use is currently discouraged
- full backward hooks : run each time backward is performed
- state dict hooks : used internally by the module's load_state_dict function; users rarely work with them directly

Tensor hook
- A hook that runs during backward propagation, when the gradient of the Tensor is computed
- Register a hook with torch.Tensor.register_hook
- Inspect the registered hooks through the private attribute torch.Tensor._backward_hooks
  import torch
  tensor = torch.ones(2, requires_grad=True)
  def hook(grad):
      pass
  tensor.register_hook(hook)
  tensor._backward_hooks # OrderedDict([(0, <function __main__.hook(grad)>)])
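
A Tensor hook may also return a replacement gradient. The sketch below scales the gradient by 10; the factor and variable names are arbitrary.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# The returned value replaces the gradient that would be stored in x.grad
handle = x.register_hook(lambda grad: grad * 10)

(x * 3).sum().backward()
print(x.grad)    # 3 * 10 for every element

handle.remove()  # detach the hook once it is no longer needed
```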
Module hook
- Module hooks come in three kinds (backward_hooks is deprecated in favor of full backward hooks)
- forward pre hooks
- forward hooks
- ~~backward_hooks~~
- full backward hooks

forward pre hooks
- A hook that runs just before the forward computation
- Receives module and input as parameters and may return a modified input
- Registered with Module.register_forward_pre_hook(hook)
  forward_pre_hook(module, input) -> None or modified input
forward hooks
- A hook that runs just after the forward computation
- Receives module, input, and output as parameters and may return a modified output
- The input can also be modified, but that has no effect on the forward computation, which has already finished
- Registered with Module.register_forward_hook(hook)
  forward_hook(module, input, output) -> None or modified output
full backward hooks
- A hook that runs each time backward is performed
- Receives module, grad_input, and grad_output as parameters and may return a new grad_input
- Modifying the grad_input argument itself in place can raise an error
- Registered with Module.register_full_backward_hook(hook)
  full_backward_hooks(module, grad_input, grad_output) -> None or modified grad_input
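
The three Module hooks can be tried on a single layer. This is a sketch under simple assumptions: the hook bodies, the calls list, and the nn.Linear sizes are made up for illustration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 2)
calls = []

def pre_hook(module, inputs):
    calls.append("pre")
    return (inputs[0] * 2,)   # replace the input with a doubled copy

def fwd_hook(module, inputs, output):
    calls.append("forward")
    return output + 1         # replace the output (out of place)

def bwd_hook(module, grad_input, grad_output):
    calls.append("backward")  # returning None keeps grad_input unchanged

handles = [
    layer.register_forward_pre_hook(pre_hook),
    layer.register_forward_hook(fwd_hook),
    layer.register_full_backward_hook(bwd_hook),
]

x = torch.ones(1, 2, requires_grad=True)
y = layer(x)
y.sum().backward()

for handle in handles:
    handle.remove()

print(calls)  # order in which the hooks fired
```

Each register_* call returns a handle, and calling remove() on it detaches the hook so later calls run the plain layer again.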

3.6 apply

Applies a given function to a Module and to every submodule it contains.
Useful for initializing weights or for attaching a particular method to inner modules.

weight_initialization

def weight_initialization(module):
    module_name = module.__class__.__name__
    if 'Function' in module_name:
        module.W.data.fill_(1)

make_method

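from functools import partial

def function_repr(self):
    return f'name={self.name}'

def add_repr(module):
    module_name = module.__class__.__name__
    if 'Function' in module_name:
        # bind the module as self so extra_repr() can be called with no arguments
        module.extra_repr = partial(function_repr, module)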
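
The functions above assume custom modules whose class names contain 'Function'. For a version that runs on built-in layers, here is a sketch of apply with a hypothetical init_linear function that sets every nn.Linear weight to 1.

```python
import torch
import torch.nn as nn

def init_linear(module):
    # apply() calls this once for every submodule and for the model itself
    if isinstance(module, nn.Linear):
        nn.init.constant_(module.weight, 1.0)
        nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Linear(2, 3), nn.ReLU(), nn.Linear(3, 1))
model.apply(init_linear)

print(model[0].weight)
```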

Posted 2022-01-24 · Updated 2022-01-27 · boostcamp / Daily / TIL / Diary · a few seconds to read (about 91 words)

Week2 - Day 1 Review

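What I did today

Lectures
- PyTorch lectures 1, 2, 3
Assignments
- About halfway through PyTorch basic assignment 1
Notes
- PyTorch lectures 1, 2, 3 and basic assignment 1

What we did in the peer session

What it means to pass gradient as a parameter to the backward function
The difference between reshape and view

To do tomorrow

Finish writing up the basic assignment

Thoughts on the day

Budeok... my eyes hurt so much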

Posted 2022-01-24 · Updated 2022-01-25 · boostcamp / week · 8 min read (about 1256 words)

Boostcamp AI Tech Week 2, Day 1: PyTorch (2)

2. Useful torch functions

Notes on the built-in torch functions that are likely to be used often
PyTorch official documentation - link

Tensors
Creation Ops
indexing, Slicing, Joining, Mutating

2.1 Tensors

is_*
- Functions that test properties of the data, such as whether an object is a tensor or what a tensor contains
  x = torch.tensor([0])
  torch.is_tensor(x) # True
  torch.is_nonzero(x) # False, input : single element tensor
torch.numel(x)
- Returns the total number of elements
  a = torch.randn(1, 2, 3, 4, 5)
  torch.numel(a) # 120

2.2 Creation Ops

torch.from_numpy
- Converts an ndarray into a torch.Tensor
torch.zeros(size), empty(size), ones(size)
- Create a tensor of shape size filled with 0, uninitialized values, or 1
- Work like their numpy counterparts
torch.zeros_like(tensor), empty_like(tensor), ones_like(tensor)
- Create a tensor with the same size as tensor, filled with 0, uninitialized values, or 1
- Work like their numpy counterparts
torch.arange(start, end, step)
- Works like numpy's arange
- Creates a 1-D tensor of values from start up to (but not including) end, spaced by step
torch.linspace(start, end, steps)
- Creates a 1-D tensor of steps evenly spaced points from start to end, both included
torch.full(size, fill_value), torch.full_like(tensor, fill_value)
- Create a tensor filled with fill_value
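
A quick sketch of the creation functions listed above, with arbitrary sizes and values:

```python
import numpy as np
import torch

zeros = torch.zeros(2, 3)
ones = torch.ones_like(zeros)             # same shape and dtype as zeros
steps = torch.arange(0, 10, 3)            # end is excluded
points = torch.linspace(0, 1, steps=5)    # both endpoints are included
sevens = torch.full((2, 2), 7)
from_np = torch.from_numpy(np.array([1, 2, 3]))

print(steps)    # tensor([0, 3, 6, 9])
print(points)   # five values from 0 to 1 in steps of 0.25
print(sevens)
```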

2.3 Indexing, Slicing, Joining, Mutating functions

torch.index_select(input, dim, index)

Collects and returns the entries at the given indices along dim

A = torch.Tensor([[1, 2], [3, 4]])
torch.index_select(A, 1, torch.tensor([0]))
===========================================
output:
tensor([[1.],
        [3.]])

torch.gather(input, dim, index)

Collects and returns the values at the positions given by an index tensor, one index per output element

t = torch.tensor([[1, 2], [3, 4]])
torch.gather(t, 1, torch.tensor([[0, 0], [1, 0]]))
==================================================
output:
tensor([[ 1,  1],
        [ 4,  3]])
==================================================
index calculate:
out[i][j][k] = input[index[i][j][k]][j][k]  # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k]  # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]]  # if dim == 2

torch.cat(tensors, dim) == torch.concat
- Concatenates the given tensors
- All tensors must have the same shape except along dim
  x = torch.rand(1, 3)
  y = torch.rand(2, 3)
  torch.cat((x,y), 0).size() # torch.Size([3, 3])

torch.chunk(input, chunks, dim)

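Splits a tensor into the requested number of chunks.
The pieces are equal in size, and no more than chunks pieces are produced.

If the size is not evenly divisible, the last tensor may be smaller, and fewer than chunks tensors may be returned.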

torch.arange(13).chunk(6)
=========================
output:
(tensor([0, 1, 2]),
 tensor([3, 4, 5]),
 tensor([6, 7, 8]),
 tensor([ 9, 10, 11]),
 tensor([12]))

t = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
print(torch.chunk(t, 2, 1))
===========================
output:
(tensor([[1, 2],
        [4, 5]]), tensor([[3],
        [6]]))

torch.Tensor.scatter_(dim, index, src, reduce=None)

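Writes the values of src into the Tensor at the positions given by index.
Passing 'add' or 'multiply' as reduce accumulates by addition or multiplication instead of overwriting.

It works as the inverse of torch.gather.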

src = torch.arange(1, 11).reshape((2, 5)) 
# tensor([[ 1,  2,  3,  4,  5], [ 6,  7,  8,  9, 10]])
index = torch.tensor([[0, 1, 2, 0]])
torch.zeros(3, 5, dtype=src.dtype).scatter_(0, index, src)
==========================================================
output:
tensor([[1, 0, 0, 4, 0],
        [0, 2, 0, 0, 0],
        [0, 0, 3, 0, 0]])
==========================================================
index calculate:
self[index[i][j][k]][j][k] = src[i][j][k]  # if dim == 0
self[i][index[i][j][k]][k] = src[i][j][k]  # if dim == 1
self[i][j][index[i][j][k]] = src[i][j][k]  # if dim == 2

torch.stack(tensors, dim)
- Stacks tensors along a new dimension inserted at dim
- All tensors must have exactly the same shape
  x = torch.rand(3, 1, 3) # 3, 1, 3
  y = torch.rand(3, 1, 3) # 3, 1, 3
  torch.stack((x,y), dim=2).size() #torch.Size([3, 1, 2, 3])

2.4 Random Sampling

Used often, but similar enough to numpy that consulting the documentation is the better approach.
Random sampling - PyTorch official documentation
torch.seed(), torch.manual_seed(int)
- Fixing the seed makes random results reproducible
- torch.seed() picks a non-deterministic seed and returns it; manual_seed takes the seed value directly
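
Reproducibility with manual_seed can be checked directly. The seed value 42 is an arbitrary choice.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3)

torch.manual_seed(42)     # same seed again
second = torch.rand(3)

print(torch.equal(first, second))  # the two draws are identical
```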

2.5 Pointwise Ops

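Functions for element-wise mathematical operations
- Similar to numpy

torch.sqrt(tensor)
- Square root of each element of the tensor
torch.exp(tensor)
- $e^x$ for each element of the tensor
torch.pow(tensor, exponent)
- Each element of the tensor raised to exponent, e.g. $x^2$ when exponent is 2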
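
A short sketch of the three pointwise functions on small inputs:

```python
import torch

x = torch.tensor([1.0, 4.0, 9.0])

roots = torch.sqrt(x)               # tensor([1., 2., 3.])
squares = torch.pow(x, 2)           # tensor([ 1., 16., 81.])
exps = torch.exp(torch.zeros(2))    # e^0 = 1 for each element

print(roots, squares, exps)
```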

2.6 Reduction Ops

Functions that reduce a tensor to particular values, such as a maximum, sum, or mean
Most behave the same as in numpy
Reduction Ops - PyTorch official documentation
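
A few common reductions, as a sketch on a 2x2 tensor:

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

total = x.sum()                       # over all elements
col_mean = x.mean(dim=0)              # mean of each column
row_max, row_argmax = x.max(dim=1)    # values and their indices per row
flat_argmax = torch.argmax(x)         # index into the flattened tensor

print(total, col_mean, row_max, row_argmax, flat_argmax)
```

Reducing along a dim with max returns both the values and the indices, which differs from numpy, where argmax is a separate call.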

2.7 Comparison Ops

Functions for comparison
Comparison Ops - PyTorch official documentation

torch.argsort(tensor)

Returns the indices that would sort the tensor (ascending, along the last dimension by default)

a = torch.randint(1, 10, (3, 3))
a
torch.argsort(a)
================
output:
tensor([[9, 5, 3],
        [6, 4, 2],
        [5, 8, 6]])
tensor([[2, 1, 0],
        [2, 1, 0],
        [0, 2, 1]])

torch.eq, torch.gt, torch.ge
- Test element-wise whether values are equal, greater than, or greater than or equal
torch.allclose(input, other, rtol, atol)
- Tests whether every element of input is within a tolerance of the corresponding element of other
  $$
  |\operatorname{input} - \operatorname{other}| \leq \operatorname{atol} + \operatorname{rtol} \times |\operatorname{other}|
  $$

torch.allclose(torch.tensor([10.1, 1e-9]), torch.tensor([10.0, 1e-08])) # False

2.8 Other Operations

A collection of functions with various other purposes
Other Operations - PyTorch official documentation

torch.einsum

Performs operations written in Einstein notation

Einstein notation is a compact way to write a sum over a set of indices

x = torch.randn(5)
y = torch.randn(4)
torch.einsum('i,j->ij', x, y)
============================
output:
tensor([[ 0.1156, -0.2897, -0.3918,  0.4963],
        [-0.3744,  0.9381,  1.2685, -1.6070],
        [ 0.7208, -1.8058, -2.4419,  3.0936],
        [ 0.1713, -0.4291, -0.5802,  0.7350],
        [ 0.5704, -1.4290, -1.9323,  2.4480]])
==============================================
As = torch.randn(3,2,5)
Bs = torch.randn(3,5,4)
torch.einsum('bij,bjk->bik', As, Bs)
====================================
output:
tensor([[[-1.0564, -1.5904,  3.2023,  3.1271],
        [-1.6706, -0.8097, -0.8025, -2.1183]],

        [[ 4.2239,  0.3107, -0.5756, -0.2354],
        [-1.4558, -0.3460,  1.5087, -0.8530]],

        [[ 2.8153,  1.8787, -4.3839, -1.2112],
        [ 0.3728, -2.1131,  0.0921,  0.8305]]])

2.9 BLAS & LAPACK Ops

“BLAS” - Basic Linear Algebra Subprograms
“LAPACK” - Linear Algebra PACKage
Functions for linear algebra
BLAS & LAPACK Ops - PyTorch official documentation

Posted 2022-01-24 · Updated 2022-01-24 · boostcamp / week · 4 min read (about 558 words)

Boostcamp AI Tech Week 2, Day 1: PyTorch (1)

0. What is PyTorch?

A deep learning framework developed by Meta (formerly Facebook)
numpy + automatic differentiation (autograd)
Based on dynamic computation graphs

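1. PyTorch basics

PyTorch uses the Tensor class.
Tensor
- Practically the same as numpy's ndarray
  - Most built-in functions have a close equivalent
- The data types a tensor can hold are the same as ndarray's; the difference is that a tensor can be used on the GPU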

1.1 Basic Tensor functions

list > tensor

import torch
data = [[3, 5],[10, 5]]
x_data = torch.tensor(data)
##########################
output:
tensor([[ 3,  5],
        [10,  5]])

ndarray > tensor

import numpy as np
nd_array_ex = np.array(data)
tensor_array = torch.from_numpy(nd_array_ex)
############################################
output:
tensor([[ 3,  5],
        [10,  5]])

tensor > ndarray

tensor_array.numpy()
####################
output:
array([[ 3,  5],
       [10,  5]])

flatten

data = [[3, 5, 20],[10, 5, 50], [1, 5, 10]]
x_data = torch.tensor(data)
x_data.flatten()
################
output:
tensor([ 3,  5, 20, 10,  5, 50,  1,  5, 10])

ones_like

torch.ones_like(x_data)
#######################
output:
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]])

shape, dtype

x_data.shape # torch.Size([3, 3])
x_data.dtype # torch.int64

GPU load

device = torch.device('cpu')
if torch.cuda.is_available():
    device = torch.device('cuda')

1.2 Tensor handling

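view & reshape
- Functions that change the shape of a tensor
- view always returns a tensor that shares data (the same memory) with the input tensor, and fails when that is not possible
- reshape returns a view when it can and a copy otherwise
  - When memory sharing must be guaranteed use view; when an independent copy is needed use clone()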
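
The sharing behavior can be observed on a small tensor. This sketch uses a transposed (non-contiguous) tensor to force reshape to copy; the variable names are arbitrary.

```python
import torch

x = torch.arange(6)
v = x.view(2, 3)
v[0, 0] = 100                 # writes through to x because memory is shared

xt = x.view(2, 3).t()         # transposed: no longer contiguous
try:
    xt.view(6)                # view cannot express this layout
    view_failed = False
except RuntimeError:
    view_failed = True

r = xt.reshape(6)             # reshape falls back to a copy here
r[0] = -1                     # does not touch x

c = x.clone()                 # explicit independent copy
print(x, view_failed, r)
```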

squeeze & unsqueeze

Functions that remove or insert dimensions of size 1

unsqueeze(index) : inserts a dimension of size 1 at index, adding one dimension to the tensor

tensor_ex = torch.rand(size=(2, 1, 2))
tensor_ex.squeeze().shape # torch.Size([2, 2])
tensor_ex = torch.rand(size=(2, 2))
tensor_ex.unsqueeze(0).shape # torch.Size([1, 2, 2])
tensor_ex = torch.rand(size=(2, 2))
tensor_ex.unsqueeze(1).shape # torch.Size([2, 1, 2])
tensor_ex = torch.rand(size=(2, 2))
tensor_ex.unsqueeze(2).shape # torch.Size([2, 2, 1])

1.3 Tensor operation

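Like numpy, operations support broadcasting.
Matrix multiplication uses mm and matmul.
- dot works only on two 1-D vectors and returns their inner product as a scalar
- mm works only on 2-D matrices; matmul also accepts 1-D and batched (3-D or higher) inputs
- mm does not support broadcasting, but matmul does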
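
The three functions can be compared on small inputs. Shapes here are arbitrary and chosen so the batch case needs broadcasting.

```python
import torch

u = torch.tensor([1.0, 2.0, 3.0])
v = torch.tensor([4.0, 5.0, 6.0])
inner = torch.dot(u, v)               # 1*4 + 2*5 + 3*6

A = torch.ones(2, 3)
B = torch.ones(3, 4)
mm_shape = torch.mm(A, B).shape       # plain 2-D matrix product

batch = torch.ones(5, 2, 3)
matmul_shape = torch.matmul(batch, B).shape   # B is broadcast over the batch

try:
    torch.mm(batch, B)                # mm rejects the 3-D input
    mm_failed = False
except RuntimeError:
    mm_failed = True

print(inner, mm_shape, matmul_shape, mm_failed)
```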

1.4 Tensor operation for ML/DL formula

nn.functional provides many of these operations
softmax, argmax, one_hot, and so on
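
A sketch chaining the three functions named above on a made-up logits tensor:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 3.0]])

probs = F.softmax(logits, dim=1)             # each row sums to 1
pred = probs.argmax(dim=1)                   # index of the largest probability
onehot = F.one_hot(pred, num_classes=3)      # integer one-hot encoding

print(probs, pred, onehot)
```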

1.5 AutoGrad

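Automatic differentiation.
Setting requires_grad=True on a tensor makes autograd track its gradient automatically.
- The parameters of nn layers (e.g. nn.Linear) already default to requires_grad=True, so it is rarely set by hand
  torch.tensor(data, requires_grad=True)  # data must be floating point
backward() performs backpropagation.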
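
A minimal sketch of the whole flow: track a tensor, build a scalar from it, and read the gradient after backward. The function y = sum(x^2) is chosen only because its gradient, 2x, is easy to verify.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (x ** 2).sum()   # scalar output, so backward needs no gradient argument
y.backward()

print(x.grad)        # dy/dx = 2 * x
```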