[LLM 개발] Colab에서 허깅페이스 오픈 모델 사용

Llama-2 이후로 오픈 모델이 쏟아져 나오는데, 개인들은 이 모델들을 마땅히 테스트해보기도 쉽지 않다. 구글 코랩은 오래전부터 NVidia T4를 기본 GPU로 제공하고 있고, 허깅페이스에서 오픈 소스 모델들을 활용할 수 있는 라이브러리들을 제공하고 있어서 간단한 테스트 정도는 해 볼 수 있다. 참고로, Colab의 런타임을 T4로 바꿔줘야 한다.

특히, Llama 모델을 C 언어로 인퍼런스하도록 한 llama.cpp와 GGML 덕분에 경량화하여 사용할 수 있다. ctransformer는 허깅페이스에서 제공하는 것은 아니지만 GGML을 활용하여 transformer 계열 모델들을 사용할 수 있게 해 준 것으로 경량화된 LLM을 손쉽게 사용할 수 있도록 해 준다. 현시점에서 지원하는 언어모델은 아래와 같다.

GitHub - marella/ctransformers: Python bindings for the Transformer models implemented in C/C++ using GGML library.

Python bindings for the Transformer models implemented in C/C++ using GGML library. - GitHub - marella/ctransformers: Python bindings for the Transformer models implemented in C/C++ using GGML libr...

github.com

ctransformer는 허깅페이스의 transformer 패키지만큼 기능을 충분히 지원하고 있지 않아서 tokenizer와 pipeline은 transformer package를 사용한다.

> pip install ctransformers[cuda]
> pip install --upgrade git+https://github.com/huggingface/transformers

사용할 모델은 mistral-7b로 빅테크 출신의 엔지니어들이 유럽에서 만든 모델이다. 7b지만 Llama-2 13B 이상의 성능을 낸다고 알려져 있다.

from ctransformers import AutoModelForCausalLM

# gpu_layers는 속도 개선을 위해 50 정도로만 사용
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    model_type="mistral",
    gpu_layers=50,
    hf=True
)

Transformer 패키지의 Tokernizer와 pipeline 라이브러리를 사용하여 모델 파이프라인 구성. 여기서는 Text Generation을 테스트한다.

from transformers import AutoTokenizer, pipeline

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

# Pipeline
generator = pipeline(
    model=model, tokenizer=tokenizer,
    task='text-generation',
    max_new_tokens=200,
    repetition_penalty=1.1
)

프롬프트를 입력하여 검증한다.

prompt = """
The following is a conversation with an AI research assistant. 
The assistant tone is technical and scientific.
Human: Hello, who are you?
AI: Greeting! I am an AI research assistant. 
How can I help you today?
Human: Can you tell me about the creation of blackholes?
AI:
"""

response = generator(prompt)
print(response[0]["generated_text"])

Mistral은 아래와 같이 대답한다(아직 한글은 미지원). 대답(추론)은 14초 정도 소요되었다.

Black holes are formed when massive stars reach the end of their life cycle. 
When a star exhausts its nuclear fuel, it may explode into a supernova, 
which can then collapse under its own gravity to form a black hole. 
Black holes are characterized by their immense gravitational pull, 
which is so strong that not even light can escape from them. 
They are also known for their event horizon, which is the point of no return 
beyond which anything that crosses it cannot be retrieved. 
Black holes come in different sizes, ranging from stellar-mass black holes, 
which are formed from the collapse of a single star, to supermassive black holes, 
which are believed to exist at the center of most galaxies, including our own Milky Way.

#한글은 DeepL로 번역
블랙홀은 거대한 별이 수명 주기가 끝날 때 형성됩니다. 
별이 핵연료를 다 소진하면 초신성으로 폭발할 수 있습니다, 
그러면 자체 중력에 의해 붕괴되어 블랙홀이 형성될 수 있습니다. 
블랙홀의 특징은 엄청난 중력입니다, 
빛조차도 빠져나갈 수 없을 정도로 강력합니다. 
블랙홀은 또한 사건의 지평선(사건의 지평선)으로도 알려져 있는데, 이는 돌아올 수 없는 지점입니다. 
그 너머로 넘어가는 것은 아무것도 회수할 수 없습니다. 
블랙홀은 항성 질량 블랙홀부터 별 질량 블랙홀까지 다양한 크기로 존재합니다, 
단일 별의 붕괴로 형성되는 항성 질량 블랙홀부터 초질량 블랙홀까지 다양합니다, 
우리 은하를 포함한 대부분의 은하 중심에 존재하는 것으로 추정됩니다.

저작자표시

'AI 빅데이터 > AI 동향' 카테고리의 다른 글

[AI가속] TensorRT-LLM (0)	2023.10.25
[Inference] WebGPU (0)	2023.10.25
[LangChain] VertexAI의 LLM으로 Web검색 연동하기 (1)	2023.08.24
[PaLM-2] VertexAI와 Neo4j로 영화 추천하기 (0)	2023.08.15
[LangChain] Vertex AI의 PaLM으로 LangChain QA하기 (0)	2023.06.25

마고커

[LLM 개발] Colab에서 허깅페이스 오픈 모델 사용

'AI 빅데이터 > AI 동향' 카테고리의 다른 글

댓글

티스토리툴바

[LLM 개발] Colab에서 허깅페이스 오픈 모델 사용

'AI 빅데이터 > AI 동향' 카테고리의 다른 글

관련글

댓글

티스토리툴바