Categories: LLM

Fix RuntimeError: unscale_() has already been called on this optimizer since the last update().

When fine-tuning a model with LoRA and the HuggingFace Transformers library, I received this error:

RuntimeError: unscale_() has already been called on this optimizer since the last update().

This error is caused by using the development version transformers-4.31.0.dev0. The solution is to revert to transformers 4.30.2 with:

pip install transformers==4.30.2
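To see why the message appears, here is a minimal stdlib-only mock of the one-unscale_-per-update contract that PyTorch's GradScaler enforces. This is an illustrative sketch, not the real torch API: the class name and methods below only imitate the behavior that produces the error.

```python
class MockGradScaler:
    """Illustrative mock: unscale_() may run only once per optimizer update."""

    def __init__(self):
        self._unscaled = False

    def unscale_(self, optimizer):
        # A second unscale_() before the next update raises, just like the
        # error message in the post.
        if self._unscaled:
            raise RuntimeError(
                "unscale_() has already been called on this optimizer "
                "since the last update()."
            )
        self._unscaled = True

    def step(self, optimizer):
        # The step/update resets the flag, allowing the next unscale_().
        self._unscaled = False


scaler = MockGradScaler()
scaler.unscale_("opt")
scaler.step("opt")      # fine: the update resets the flag
scaler.unscale_("opt")  # fine again after the update
```

Presumably the dev Trainer ended up calling unscale_() twice in one step, which the stable 4.30.2 release does not do.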
Categories: Machine Learning

Solve unsupported GNU version! gcc versions later than 11 are not supported!

When installing a Python module such as AutoGPTQ, you may get this error:

/usr/local/cuda/include/crt/host_config.h:132:2: error: #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
        132 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
            |  ^~~~~
      error: command '/usr/local/cuda/bin/nvcc' failed with exit code 1
      [end of output]

To solve this, install GCC at the maximum supported version and point nvcc to it:

MAX_GCC_VERSION=11

sudo apt install gcc-$MAX_GCC_VERSION g++-$MAX_GCC_VERSION
sudo ln -s /usr/bin/gcc-$MAX_GCC_VERSION /usr/local/cuda/bin/gcc
sudo ln -s /usr/bin/g++-$MAX_GCC_VERSION /usr/local/cuda/bin/g++
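The check in host_config.h boils down to comparing the GCC major version against the CUDA release's cap. A small sketch of that logic (the cap of 11 matches the error above; the function name is mine, not part of CUDA):

```python
def gcc_supported(version: str, max_major: int = 11) -> bool:
    """Mirror nvcc's host compiler check: the GCC major version
    must not exceed the CUDA release's maximum."""
    major = int(version.split(".")[0])
    return major <= max_major


print(gcc_supported("11.4.0"))  # True
print(gcc_supported("12.3.0"))  # False: this is the case that triggers the error
```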
Categories: Deep Learning

Increase NVIDIA RTX 4090 Fan Speed to Reduce High Temperature

An RTX 4090 at full load under machine learning training can produce a lot of heat; it can reach 80-85 degrees Celsius. Using big industrial fans to cool the GPU and opening the PC case can reduce it to around 70 degrees Celsius.

However, before going down that path, you can raise your NVIDIA GPU fan speed from the default 30% to 90% or even 100%. Here are the steps to do it in Ubuntu.

First, you need to configure X11:

sudo vim /etc/X11/xorg.conf

Add Option "Coolbits" "4" in the NVIDIA Device section:

Section "Device"
     Identifier      "Device0"
     Driver          "nvidia"
     VendorName      "NVIDIA"
     Option          "Coolbits" "4"
EndSection

Reboot your PC to apply the new changes

The second step is to adjust the fan speeds. I usually use Psensor to monitor the fan speed. The RTX 4090 has two fans, so you need to tune both of them.
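Once Coolbits is enabled, both fans can also be set from the command line with nvidia-settings. The helper below just assembles that command as a sketch; the GPUFanControlState and GPUTargetFanSpeed attribute names come from nvidia-settings, while the 80% value, function name, and fan ids are my example assumptions:

```python
def fan_speed_command(gpu: int = 0, fan_ids=(0, 1), speed: int = 80):
    """Build an nvidia-settings invocation that pins each fan to `speed` percent."""
    # GPUFanControlState=1 switches the GPU to manual fan control.
    args = ["nvidia-settings", "-a", f"[gpu:{gpu}]/GPUFanControlState=1"]
    for fan in fan_ids:
        args += ["-a", f"[fan:{fan}]/GPUTargetFanSpeed={speed}"]
    return args


# On a machine with the NVIDIA driver and an X session, run it with subprocess:
# subprocess.run(fan_speed_command(speed=80), check=True)
print(" ".join(fan_speed_command()))
```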

Categories: Deep Learning

Solve ImportError: cannot import name ‘Linear8bitLt’ from quantization

When running --quantize llm.int8 in the adapter for Lit-LLaMA, I got this error:

ImportError: cannot import name 'Linear8bitLt' from 'lit_llama.quantization'

The first step is to make sure bitsandbytes itself is running well:

python -m bitsandbytes

And I received

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
...
packages/bitsandbytes/functional.py", line 12, in <module>
    from scipy.stats import norm
ModuleNotFoundError: No module named 'scipy'

Now I know the problem: scipy is not installed. The solution is to install scipy:

pip install scipy

And I re-ran bitsandbytes again to confirm the fix.

Categories: Ubuntu

Solving Files and Folders Suddenly Disappearing on Ubuntu

I’m using Ubuntu 23.04 Cinnamon, the latest in 2023, and after moving files, the Nautilus file manager suddenly showed this error:

Couldn't open file. No program to open the file

And suddenly the file, the folder, and all directories inside it were gone.

If you have this problem, don’t panic. To solve it, reboot Ubuntu. Once you log in, check the Trash and the files will be there!

Categories: Machine Learning

Install Transformers, PyTorch and TensorFlow on Ubuntu 2023

Installing Transformers, PyTorch and TensorFlow with working GPU support on the latest Ubuntu requires several steps. This is how I successfully set it up and ran several models with it.

Please make sure to install the latest NVIDIA driver. I use an RTX 4090 in this case. This is the link: https://www.nvidia.com/download/driverResults.aspx/200481/en-us/

If you are using the nouveau driver, you can disable it via:

sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"

sudo update-initramfs -u
sudo reboot
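After the reboot, and once PyTorch is installed, a quick sanity check for GPU visibility can be sketched like this. The helper is my own: it returns None when torch is not installed, so it is safe to run anywhere:

```python
def cuda_available():
    """Return torch.cuda.is_available(), or None if torch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


# True means the driver and CUDA-enabled PyTorch build see the GPU.
print(cuda_available())
```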
Categories: Machine Learning

Install Stable Diffusion Automatic1111, Torch 2.0 and Fix RTX 4090 Performance

I use a clean installation of Ubuntu 23.04 Lunar Lobster and NVIDIA driver 525. If you already have the driver installed, here are the steps to improve Automatic1111 Stable Diffusion performance to 40-44 it/s.

  1. Install Anaconda

Ubuntu 23.04's default Python version is 3.11. In this case, I will use Anaconda to provide Python 3.10. Download Anaconda and install it:

chmod a+x Anaconda3-2023.03-1-Linux-x86_64.sh
./Anaconda3-2023.03-1-Linux-x86_64.sh
Categories: Machine Learning

Solve Pandas Drop Duplicates still not unique in Value Counts

When using pandas drop_duplicates, we may still encounter rows that are duplicated, which can be checked via:

df.column_name.value_counts()

This is not a pandas bug: drop_duplicates compares all columns by default, so rows that share a key but differ in any other column are kept. To remove duplicate rows and produce 100% unique values based on an index or key column, you can use this:

df_unique = df_unique.drop(df_unique[df_unique["key_column_name"].duplicated()].index)
df_unique.key_column_name.value_counts()
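A more idiomatic equivalent, assuming the goal is keep-first uniqueness on the key column, is drop_duplicates with the subset parameter, which restricts the comparison to that column. The frame below is a hypothetical example:

```python
import pandas as pd

# Hypothetical frame where the key repeats but other columns differ,
# which is exactly the case a plain drop_duplicates() does not catch.
df = pd.DataFrame({"key_column_name": [1, 1, 2], "value": ["a", "b", "c"]})

# subset= compares only the key column; keep="first" mirrors the
# duplicated()-based filter above.
df_unique = df.drop_duplicates(subset="key_column_name", keep="first")
print(df_unique)
```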
Categories: Machine Learning

Install LightGBM with GPU on Linux Ubuntu

LightGBM can work faster on a GPU. In PyCaret, I passed the parameter use_gpu=True to TSForecastingExperiment() and got these errors:

[LightGBM] [Fatal] GPU Tree Learner was not enabled in this build.
Please recompile with CMake option -DUSE_GPU=1
[LightGBM] [Fatal] GPU Tree Learner was not enabled in this build.
Please recompile with CMake option -DUSE_GPU=1
[LightGBM] [Fatal] GPU Tree Learner was not enabled in this build.
Please recompile with CMake option -DUSE_GPU=1

To enable this, we need to uninstall the current LightGBM and reinstall it with GPU support. For Linux Ubuntu, it's best to install the prerequisite packages first:

sudo apt install cmake build-essential libboost-all-dev

Make sure you already have the NVIDIA CUDA Toolkit installed:

sudo apt install nvidia-cuda-toolkit

The first option is installation inside a conda environment:

pip uninstall lightgbm -y

conda install -c conda-forge gcc=12.1.0
pip install lightgbm --config-settings=cmake.define.USE_GPU=ON --config-settings=cmake.define.OpenCL_INCLUDE_DIR="/usr/local/cuda/include/" --config-settings=cmake.define.OpenCL_LIBRARY="/usr/local/cuda/lib64/libOpenCL.so"

The second option is installation outside of an environment:

# Get the LightGBM source.
git clone --recursive https://github.com/Microsoft/LightGBM.git
cd LightGBM
mkdir build && cd build
# Run cmake, specifying the locations of the OpenCL files.
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
# Compile.
make -j4
# Install for Python, using what we just compiled.
cd ../python-package
python setup.py install --precompile
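After reinstalling, the GPU build is selected per model via LightGBM's device parameter. The sketch below collects the relevant parameters in one place; the platform and device ids of 0 are my assumption for a single-GPU machine:

```python
def lightgbm_gpu_params():
    """Parameters that select LightGBM's GPU tree learner."""
    # "device": "gpu" picks the OpenCL tree learner the rebuild enabled;
    # the ids choose the OpenCL platform and the GPU within it.
    return {"device": "gpu", "gpu_platform_id": 0, "gpu_device_id": 0}


# e.g. lgb.LGBMRegressor(**lightgbm_gpu_params()) on a machine with the GPU build
print(lightgbm_gpu_params())
```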
Categories: Windows

Install Stable Diffusion on Windows and Fix RTX Performance 2023

There has been a lot of feedback about NVIDIA RTX performance after installing Stable Diffusion Automatic1111. I will explain a simple way to install it and fix RTX 4090 performance within 5 minutes.

First, make sure you have Python 3.10 in your Windows. You can use Anaconda or native Python installation.

  1. Clone the Stable Diffusion git repository to your local directory
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

2. Install Stable Diffusion with xformers

This part is tricky. By default, it will install Torch 2.1.0; however, the latest xformers requires Torch 2.0, and you will later encounter problems like:

AssertionError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

The solution for installing both xformers and a compatible Torch inside Stable Diffusion is to pass the argument at launch:

webui.bat --xformers