TIFF 파일용 LZW 사용 요구 사항

TIFF 버전 6 표준 문서에서는 섹션 13의 LZW 사용에 대해 다음과 같이 설명합니다:

각 스트립의 데이터는 독립적으로 압축되며 사전 테이블은 각 스트립마다 다릅니다. 압축하기 전에 각 스트립에 약 8K 바이트가 포함되도록 RowsPerStrip 옵션을 설정하는 것이 좋습니다. 이 옵션의 주된 목적은 압축된 스트립 데이터와 압축되지 않은 스트립 데이터를 모두 메모리에 보관할 수 있도록 스트립을 충분히 작게 만들고 콘텐츠가 적은 시스템에서도 거의 최적의 압축률을 유지할 수 있도록 하기 위한 것입니다.
TIFF는 최소 길이 9, 최대 길이 12의 가변 길이 인코딩을 사용하여 LZW 알고리즘을 구현합니다. 각 바이트는 총 8비트를 차지하며 0에서 255까지의 정수 범위에 해당합니다. 256과 257은 각각 두 개의 특수 문자, CLEAR_CODE와 EOI_CODE에 대응하도록 TIFF 파일에 지정되며, 이 중 CLEAR_CODE는 압축 또는 압축 해제 중에 확장 사전의 용량이 2^12=4096을 초과하면 프로그램에 사전 테이블을 다시 초기화하도록 알리는 데 사용되고 각 스트립의 압축 결과는 CLEAR_CODE로 시작됩니다. 압축 또는 압축 해제 중에 확장 사전의 용량이 2^12=4096을 초과할 때 프로그램에 사전 테이블을 다시 초기화하도록 지시하는 데 사용되며, 각 스트립의 압축 결과는 CLEAR_CODE로 시작하고 각 스트립의 압축 결과 끝에 현재 스트립의 데이터가 압축되었음을 나타내는 EOI_CODE가 배치됩니다.
압축 결과는 바이트 단위로 기록되므로 작은 엔디안 바이트 순서 파일이든 큰 엔디안 바이트 순서 파일이든 압축된 데이터는 동일합니다.

미리 알아야 할 사항

TIFF 파일용 LZW 구현 3.

난이도 분석

하나는 인코딩을 효율적으로 관리하는 방법이고, 다른 하나는 인코딩 결과를 출력 결과에 효율적으로 작성하고 가변 길이 인코딩 결과에 대해 올바르게 변환하는 방법입니다.

코딩을 효과적으로 관리하는 방법

가장 기본적인 LZW 구현에는 수십 줄의 코드만 필요하며, 자세한 내용은 다른 글을 참조하세요. 진짜 어려움은 코드 테이블을 효율적으로 관리하는 방법입니다. 철저한 방법은 더 많은 메모리를 요구하고 프로그램 실행 속도가 느려집니다. 상용 LZW 프로그램에서는 일반적으로 성능을 개선하기 위해 몇 가지 방법이 사용됩니다. 예를 들어, 각 문자열이 인코딩되는 길이를 미리 알 수 없기 때문에 메모리 문제가 발생합니다. 대부분의 LZW 프로그램은 이 문제를 해결하기 위해 인코딩 테이블의 중복 특성을 활용합니다. 예를 들어, 다음 문서의 인코딩 예시에서 2단계 인코딩 256은 TO로 정의되고 12단계 인코딩 265는 TOB로 위치하며, 비유하자면 각 인코딩은 이전 인코딩에 새로운 문자를 더한 것으로 볼 수 있습니다.

0102 | LZW 압축 알고리즘

무기한 길이 인코딩 결과를 처리하는 방법

무한 길이 인코딩 결과를 바이트 단위로 변환하여 저장하는 것은 LZW 알고리즘의 압축률을 달성하기 위한 핵심 포인트 중 하나입니다.

예를 들어 [256, 7, 258, 8, 8, 258, 6, 6, 257] 데이터의 경우 int 타입으로 저장할 경우 총 9*4=36바이트가 필요합니다. 다른 방식으로 보면 현재 데이터 총 최대값이 258이므로 9비트 바이너리를 사용하여 000000111,100000010,000001000,000001000,100000010,000000110,000000110,001000001에 대한 이진 변환 결과를 표현하는 것을 고려할 수 있습니다. 그런 다음 변환 결과를 바이트 저장소로 변환하면 -128(1000000), 1, -32, 64, -128, 68, 8, 12, 6, -128, -128 결과에 해당하는 총 11바이트가 필요합니다.

사전 테이블이 확장됨에 따라 각 인코딩에 해당하는 비트의 길이가 증가하는데, 예를 들어 256-511 사이의 데이터는 9비트 바이너리로 표현할 수 있지만 512-1023 사이의 데이터는 10비트 바이너리를 사용하여 표현해야 합니다. 이를 처리하기 위해 비트 시프트 연산과 몇 가지 중간 변수를 사용하여 처리할 수 있습니다.

압축 프로세스

데이터 인코딩

키 코드:

public void encode(final ByteBuffer byteBuffer) throws IOException {
 if (!byteBuffer.hasRemaining()) return ;
 // CLEAR에 쓰기_CODE
 this.writeCode(CLEAR_CODE);
 this.parent = this.byte2int(byteBuffer.get());
 while (byteBuffer.hasRemaining()) {
 int current = this.byte2int(byteBuffer.get());
 int child = children[this.parent];
 if (child > 0) {
 if (suffixes[child] == current) {
 this.parent = child;
 } else {
 int sibling = child;
 while (true) {
 if (siblings[sibling] > 0) {
 sibling = siblings[sibling];
 if (suffixes[sibling] == current) {
 this.parent = sibling;
 break;
 }
 } else {
 siblings[sibling] = (short) this.nextValidCode;
 suffixes[this.nextValidCode] = (short) current;
 this.writeCode(this.parent);
 this.parent = current;
 this.nextValidCode++;
 break;
 }
 }
 }
 } else {
 children[parent] = (short) nextValidCode;
 suffixes[nextValidCode] = (short) current;
 writeCode(parent);
 parent = current;
 nextValidCode++;
 }
 }
}

작동 예: 입력 데이터: [7, 7, 7, 7, 7, 8, 8, 8, 7, 7, 7, 6, 6]

 
기본 데이터: 어린이=[](빈 배열), 접미사=[](빈 배열), 형제=[](빈 배열), 부모=7다음 유효 코드= 258
프로세스를 실행합니다:
-- current = 7
-- child = children[parent] = children[7] = 0
-- child == 0
---- children[parent] = nextValidCode = 258
---- suffixes[nextValidCode] = current = 7 (이전 단계와 함께 Ω을 넣는 것과 유사한 기능을 수행합니다.+ [77](사전에 쓰는 단계)
---- parent = current = 7
---- nextValidCode += 1 = 259
---- 출력 부모=7

 
기본 데이터: 어린이=[](7:258), 접미사=[](258:7)형제=[](빈 배열), 부모=7다음 유효 코드= 259
프로세스를 실행합니다:
-- current = 7
-- child = children[parent] = children[7] = 258
-- child > 0
---- suffixes[child] = suffixes = 7 == current
------ parent = child = 258

 
기본 데이터: 어린이=[](7:258), 접미사=[](258:7)형제=[](빈 배열), 부모=258다음 유효 코드= 259
프로세스를 실행합니다:
-- current = 8
-- child = children[parent] = children = 0
-- child == 0
---- children[parent] = children = nextValidCode = 259
---- suffixes[nextValidCode] = suffixes = current = 8 (이전 단계와 함께 Ω을 넣는 것과 유사한 기능을 수행합니다.+(λ를 사전으로 작성하는 단계)
---- parent = current = 8
---- nextValidCode += 1 = 260
---- 출력 부모=258

 
기본 데이터: 어린이=[](7:258258:259), 접미사=[](258:7259:8)형제=[](빈 배열), 부모=8다음 유효 코드= 260
프로세스를 실행합니다:
-- current = 8
-- child = children[parent] = children[8] = 0
-- child == 0
---- children[parent] = children[8] = nextValidCode = 260
---- suffixes[nextValidCode] = suffixes = current = 8 (이전 단계와 함께 수행하면 Ω+ [88](사전에 쓰는 단계)
---- parent = current = 8
---- nextValidCode += 1 = 261
---- 출력 부모=8

 
기본 데이터: 어린이=[](7:258 :260258:259), 접미사=[](258:7259:8260:8)형제=[](빈 배열), 부모=8다음 유효 코드= 261
프로세스를 실행합니다:
-- current = 7
-- child = children[parent] = children[8] = 260
-- child != 0
---- suffixes[child] = suffixes = 8 != current = 7
------ sibling = child = 258
------ siblings[sibling] = siblings == 0
-------- siblings[sibling] = siblings = nextValidCode = 261
-------- suffixes[nextValidCode] = suffixes = current = 7
-------- parent = current = 7
-------- nextValidCode += 1 = 262
-------- 출력 부모=8

 
기본 데이터: 어린이=[](7:258 :260258:259), 접미사=[](258:7259:8260:8261:7)형제=[](260:261), 부모=7다음 유효 코드= 262
프로세스를 실행합니다:
-- current = 7
-- child = children[parent] = children[7] = 258
-- child != 0
---- suffixes[child] = suffixes = 7 = current
------ parent = 258

 
기본 데이터: 어린이=[](7:258 :260258:259), 접미사=[](258:7259:8260:8261:7)형제=[](260:261), 부모=258다음 유효 코드= 262
프로세스를 실행합니다:
-- current = 6
-- child = children[parent] = children = 259
-- child != 0
---- suffixes[child] = suffixes = 8 != current = 6
------ sibling = child = 259
------ siblings[sibling] = siblings = 0
-------- siblings[sibling] = siblings = nextValidCode = 262
-------- suffixes[nextValidCode] = suffixes = current = 6
-------- parent = current = 6
-------- nextValidCode += 1 = 263
-------- 출력 부모=258

 
기본 데이터: 어린이=[](7:258 :260258:259), 접미사=[](258:7259:8260:8261:7262:6)형제=[](259:262260:261), 부모=6다음 유효 코드= 263
프로세스를 실행합니다:
-- current = 6
-- child = children[parent] = children[6] = 0
-- child == 0
---- children[parent] = children[6] = nextValidCode = 263
---- suffixes[nextValidCode] = suffixes = current = 6
---- parent = current = 6
---- nextValidCode += 1 = 264
---- 출력 부모=6

 
기본 데이터: 어린이=[](7:258 :260258:259), 접미사=[](258:7259:8260:8261:7262:6)형제=[](259:262260:261), 부모=6다음 유효 코드= 263
프로세스를 실행합니다:
-- 출력 부모=6

무기한 길이 인코딩 결과를 바이트로 변환하기

키 코드

비트포즈오브다음코드: 다음 인코딩 결과에서 바이트에 쓰기 시작할 비트의 위치 비트프롬이전코드: 이전 인코딩 결과에서 바이트에 기록할 내용

private void writeCode(final int code) throws IOException {
 // 이전 인코딩 결과에서 남은 비트와 현재 값을 결합하는 중간 변수입니다.
 int var1 = (this.bitsFromPreviousCode << this.bitsPerCode) | (code & this.maxCode);
 // 중간 변수, 이전 인코딩 결과에서 남은 비트 길이와 현재 인코딩 결과에 해당하는 비트 길이의 합입니다.
 int var2 = this.bitPosOfNextCode + this.bitsPerCode;
 while (var2 >= 8) {
 // 병합된 비트의 길이가 8보다 큰 경우, 처음 8비트의 데이터는 오른쪽 시프트 연산으로 잘립니다.
 // 0xff에는 두 가지 주요 기능이 있습니다. 하나는 음수를 정수로 변환하는 것이고, 다른 하나는 마지막 악의 사이클에서 var2가 8의 배수일 때 데이터의 마지막 8비트를 가로채는 것입니다.
 int var3 = (var1 >> (var2 - 8)) & 0xff;
 // int에 쓸 때 출력 스트림이 자동으로 바이트 단위로 변환됩니다.
 this.codeStream.write(var3);
 var2 -= 8;
 }
 // 비트포즈오브다음코드 및 비트프롬이전코드 업데이트
 this.bitPosOfNextCode = var2;
 this.bitsFromPreviousCode = var1 & this.bitmaskFor(this.bitPosOfNextCode);
}

연산 예: 입력 데이터 [256, 7, 258, 8, 8, 258, 6, 6, 257], 해당 출력 결과는 [-128, 1, -32, 64, -128, 68, 8, 12, 6, -128, -128]이 됩니다.

파생 프로세스의 처음 두 단계는 여기에 나열되어 있으며 여기서 반복되지 않는 나머지 단계는 이해 당사자가 직접 추론 할 수 있습니다.

 
기본 데이터: bitsFromPreviousCode=0비트포스오브넥스트코드=0MaxCode=511
프로세스를 실행합니다:
-- code = 256
-- var1 = (bitsFromPreviousCode << bitsPerCode) | (code & maxCode) = (0 << 9) | (256 & 511) = (0 << 9) | (100000000 & 111111111) = 256
-- var2 = bitPosOfNextCode + bitsPerCode = 0 + 9 = 9
-- var2 >= 8
---- var3 = (var1 >> (var2 - 8)) & 0xff = (256 >> 1) & 0xff = (100000000 >> 1) & 255 = 10000000 & 11111111 = 128
---- 128을 쓰면 출력 스트림이 -128에 해당하는 바이트 단위로 자동 변환됩니다.
---- var2 -= 8 = 1
-- bitPosOfNextCode = var2 = 1
-- bitsFromPreviousCode = var1 & bitmaskFor(1) = 256 & 0 = 100000000 & 0 = 0

 
기본 데이터: bitsFromPreviousCode=0비트포스오브넥스트코드=1MaxCode=511
프로세스를 실행합니다:
-- code = 7
-- var1 = (bitsFromPreviousCode << bitsPerCode) | (code & maxCode) = (0 << 9) | (7 & 511) = 7
-- var2 = bitPosOfNextCode + bitsPerCode = 1 + 9 = 10
-- var2 >= 8
---- var3 = (var1 >> (var2 - 8)) & 0xff = (7 >> 2) & 255 = (111 >> 2) & 11111111 = 1 & 11111111 = 1
---- 1을 쓰면 출력 스트림이 1에 해당하는 바이트 단위로 자동 변환됩니다.
---- var2 -= 8 = 2
-- bitPosOfNextCode = var2 = 2
-- bitsFromPreviousCode = var1 & bitmaskFor(2) = 111 & 1 = 111 & 11 = 3

압축 해제 프로세스

데이터 디코딩

키 코드

public OutputStream decode(final InputStream inputStream) throws IOException {
 try (FastByteArrayOutputStream byteArrayOutputStream = new FastByteArrayOutputStream()) {
 // 관련 변수 초기화
 this.initialize();
 // 첫 번째 인코딩 결과를 읽고, 끝 마커가 아닌 경우 끝 마커를 찾을 때까지 데이터를 반복합니다.
 int code = this.readLzwCode(inputStream);
 while (code != EOI_CODE) {
 if (code == CLEAR_CODE) {
 // CLEAR_CODE
 // 1. 사전 테이블 및 관련 데이터 재초기화
 this.initialize();
 // 2. CLEAR가 아닌 것을 발견할 때까지 뒷부분을 읽습니다._CODE및 EOI_CODE 
 code = this.readLzwCode(inputStream);
 while (code == CLEAR_CODE) {
 code = this.readLzwCode(inputStream);
 }
 if (code == EOI_CODE) {
 break;
 }
 // 3. CLEAR_CODE그 이후의 첫 번째 데이터는 데이터 테이블에 있어야 하며, 다음과 같이 간단하게
 List<Integer> value = this.dictionary.get(code);
 this.writeToOutput(value, byteArrayOutputStream);
 this.previousCode = code;
 } else {
 // 사전 테이블에 존재하는 경우 해당 코드에 해당하는 값을 작성한 다음 사전 테이블에서 다음과 같이 따라갑니다.<코드 값: Ω+ [0]>같은 방식으로 새 값을 추가합니다.
 List<Integer> value = this.dictionary.get(code);
 if (value != null) {
 this.writeToOutput(value, byteArrayOutputStream);
 List<Integer> newValue = this.concat(this.dictionary.get(this.previousCode), value.get(0));
 this.addToDictionary(newValue);
 this.previousCode = code;
 } else {
 // 사전 표에 없는 경우
 // 1. 새 값 생성 및 쓰기
 List<Integer> previousValue = this.dictionary.get(this.previousCode);
 List<Integer> newValue = this.concat(previousValue, previousValue.get(0));
 this.writeToOutput(newValue, byteArrayOutputStream);
 // 2. 사전 테이블에 새 코드 추가
 this.addToDictionary(newValue);
 this.previousCode = code;
 }
 }
 code = this.readLzwCode(inputStream);
 }
 return byteArrayOutputStream;
 }
}

연산 예: 입력 데이터 [-128, 1, -32, 64, -128, 68, 8, 12, 6, -128, -128], 출력 결과는 [7, 7, 7, 7, 8, 8, 7, 7, 7, 6, 6]이 되어야 합니다.

도출 프로세스의 처음 3단계는 여기에 나열되어 있으며, 여기서 반복하지 않을 나머지 단계는 이해 당사자가 직접 추론할 수 있습니다.

 
기본 데이터: 사전={}이전 코드=0
프로세스를 실행합니다:
-- code = 256
-- code == CLEAR_CODE
---- 사전 테이블 초기화
---- code = 7
---- 사전의 코드에 해당하는 값을 작성합니다.[7]
---- previousCode = 7
-- code = 258

 
기본 데이터: 사전=<0-257>이전 코드=7다음 유효 코드=258
프로세스를 실행합니다:
-- code = 258
-- code != EOI_CODE != CLEAR_CODE
-- dictionary[code] = dictionary = null
---- previousValue = dictionary[previouse] = dictionary[7] = [7]
---- newValue = previousValue + previousValue[0] = [77]
----  [77]
---- 사전의 새 레코드, 키=258값은 다음과 같습니다.[77]
------ nextValidCode += 1 = 259 < maxCode = 511，비츠퍼코드 변경 필요 없음
---- previousValue = 258
-- code = 8

 
기본 데이터: 사전=<0-258>이전 코드=258다음 유효 코드=259
프로세스를 실행합니다:
-- code = 8
-- code != EOI_CODE != CLEAR_CODE
-- dictionary[code] = dictionary[8] = [8] != null
----  [8]
---- newValue = dictionary[previous] + dictionary[8][0] = 
---- 사전의 새 레코드, 키=259이 값은 다음 값과 같습니다.
------ nextValidCode += 1 = 260 < maxCode = 511，비츠퍼코드 변경 필요 없음
---- previousValue = 8
-- code = 8

바이트에서 LZW로 인코딩 결과

키 코드

private int readLzwCode(final InputStream inputStream) throws IOException {
 int read = inputStream.read();
 // 현재 입력 스트림에 읽을 수 있는 데이터가 없는 경우, EOI를 직접 반환합니다._CODE
 if (read < 0) {
 return EOI_CODE;
 }
 // 마지막 바이트의 결과를 8비트씩 이동하여 현재 값과 병합합니다.
 int var1 = (this.remainValueOfPreviousByte << 8) | read;
 // 이미 읽은 바이트 수: 이전 바이트에서 남은 비트 수입니다.+ 현재 바이트의 비트 8
 int var2 = this.bitsOfPreviousByte + 8;
 // 이미 읽은 바이트 비트 수가 인코딩된 결과에 해당하는 비트 수보다 여전히 적은 경우 다음 값을 읽어야 합니다.
 // bitsPerCode최대값은 12이며, 다음 조건이 두 번 이상 실행되므로 동안 루프가 필요하지 않습니다.
 if (var2 < this.bitsPerCode) {
 read = inputStream.read();
 if (read < 0) {
 return EOI_CODE;
 }
 // 이미 읽은 값은 8비트 왼쪽으로 이동하여 새로 읽은 값과 병합됩니다.
 var1 = (var1 << 8) | read;
 // 이미 읽은 바이트 수+8
 var2 += 8;
 }
 // 읽을 값에서 제외할 비트 수입니다.
 int var3 = var2 - this.bitsPerCode;
 int code = (var1 >> var3) & this.bitmaskFor(this.bitsPerCode);
 // 마스킹하여 다음 인코딩 결과에 넣어야 하는 읽기 바이트의 값을 가져옵니다.
 this.remainValueOfPreviousByte = var1 & this.bitmaskFor(var3);
 this.bitsOfPreviousByte = var3;
 return code;
}

연산 예: 입력 데이터 [-128, 1, -32, 64, -128, 68, 8, 12, 6, -128, -128], 해당 결과는 [256, 7, 258, 8, 8, 258, 6, 6, 257]입니다.

파생 프로세스의 처음 두 단계는 여기에 나열되어 있으며 여기서 반복되지 않는 나머지 단계는 이해 당사자가 직접 추론 할 수 있습니다.

 
기본 데이터: bitsOfPreviousByte=0남아 있는 값, 이전 바이트, 코드당 비트 수=9
프로세스를 실행합니다:
-- read = 128(10000000)
-- var1 = (remainValueOfPreviousByte << 8) | read = (0 << 8) | 128 = 128
-- var2 = bitsOfPreviousByte + 8 = 8
-- var2 < 9
---- read = 1(00000001)
---- var1 = (var1 << 8) | read = 32769(1000000000000001)
---- var2 += 8 = 16
-- var3 = var2 - bitsPerCode = 16 - 9 = 7
-- code = (var1 >> var3) & bitmaskFor(bitsPerCode) = (32769 >> 7) & 511 = 100000000 & 111111111 = 100000000 = 256
-- bitsOfPreviousByte = var3 = 7
-- remainValueOfPreviousByte = var1 & bitmaskFor(var3) = 32769 & 127 = 1000000000000001 & 1111111 = 1

 
기본 데이터: bitsOfPreviousByte=7, remainValueOfPreviousByte=1압축 풀기, 코드당 비트 수=9
프로세스를 실행합니다:
-- read = 224
-- var1 = (remainValueOfPreviousByte << 8) | read = (1 << 8) | 224 = 480
-- var2 = bitsOfPreviousByte + 8 = 15
-- var3 = var2 - bitsPerCode = 15 - 9 = 6
-- code = (var1 >> var3) & bitmaskFor(bitsPerCode) = (480 >> 6) & 511 = (111100000 >> 6) & 111111111 = 111 & 111111111 = 7
-- bitsOfPreviousByte = var3 = 6
-- remainValueOfPreviousByte = var1 & bitmaskFor(var3) = 480 & 63 = 111100000 & 111111 = 32

전체 코드

압축

package.funnymap.compression.lzw;
import.funnymap.compression.Encoder;
import org.springframework.util.FastByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
public class LZWEncoder implements Encoder {
 // 상수 초기화
 private static final int CLEAR_CODE = 256;
 private static final int EOI_CODE = 257;
 private static final int MIN_BIT_SIZE = 9;
 private static final int MAX_BIT_SIZE = 12;
 private static final int MAX_TABLE_SIZE = 1 << MAX_BIT_SIZE;
 // 초기화 설정
 private final OutputStream codeStream;
 // 코딩 과정에서 다음 구성이 사용됩니다.
 private final short[] children = new short[MAX_TABLE_SIZE]; // 사전에 존재하는 입력 데이터에 해당하는 키 값에 해당합니다.
 private final short[] suffixes = new short[MAX_TABLE_SIZE]; //  +λ의 마지막 값
 private final short[] siblings = new short[MAX_TABLE_SIZE]; //  +람다의 형제
 private int bitsPerCode = MIN_BIT_SIZE; // 인코딩 결과의 각 비트 길이를 나타내며 최소값은 9, 최대값은 12입니다.
 private int nextValidCode = EOI_CODE + 1; // 사전의 새 인코딩 결과 값입니다.
 private int maxCode = this.maxValueOf(bitsPerCode); // 현재 비트 번호로 표현할 수 있는 최대값입니다.
 // 결과를 바이트 단위로 인코딩할 때 사용되는 구성은 다음과 같습니다.
 private int bitPosOfNextCode = 0; // 다음 인코딩 결과에서 바이트에 쓰기를 시작할 비트의 위치입니다.
 private int bitsFromPreviousCode = 0; // 이전 인코딩 결과는 다음에 기록할 바이트가 됩니다.
 public LZWEncoder(final OutputStream codeStream) {
 this.codeStream = codeStream;
 }
 @Override
 public void encode(final ByteBuffer byteBuffer) throws IOException {
 if (!byteBuffer.hasRemaining()) return ;
 // CLEAR에 쓰기_CODE
 this.writeCode(CLEAR_CODE);
 // 입력 데이터 인코딩 및 쓰기
 this.encodeBytes(byteBuffer);
 // EOI에 쓰기_CODE
 this.writeCode(EOI_CODE);
 // 인코딩된 데이터의 마지막 남은 부분이 0인 경우 나머지 비트를 0으로 작성하여 채웁니다.
 if (this.bitPosOfNextCode > 0) {
 this.writeCode(0);
 }
 }
 @SuppressWarnings("java:S3776")
 private void encodeBytes(final ByteBuffer byteBuffer) throws IOException {
 // 코딩 프로세스
 int parent = this.byte2int(byteBuffer.get());
 while (byteBuffer.hasRemaining()) {
 int current = this.byte2int(byteBuffer.get());
 int child = this.children[parent];
 if (child > 0) {
 if (this.suffixes[child] == current) {
 parent = child;
 } else {
 int sibling = child;
 while (true) {
 if (this.siblings[sibling] > 0) {
 sibling = this.siblings[sibling];
 if (this.suffixes[sibling] == current) {
 parent = sibling;
 break;
 }
 } else {
 this.siblings[sibling] = (short) this.nextValidCode;
 this.suffixes[this.nextValidCode] = (short) current;
 this.writeCode(parent);
 parent = current;
 this.nextValidCode++;
 this.increaseBitsPerCodeOrRestIfNeeded();
 break;
 }
 }
 }
 } else {
 this.children[parent] = (short) this.nextValidCode;
 this.suffixes[this.nextValidCode] = (short) current;
 writeCode(parent);
 parent = current;
 this.nextValidCode++;
 this.increaseBitsPerCodeOrRestIfNeeded();
 }
 }
 // 마지막 데이터의 경우
 this.writeCode(parent);
 }
 private int byte2int(byte byteValue) {
 return byteValue & 0xff;
 }
 private void increaseBitsPerCodeOrRestIfNeeded() throws IOException {
 if (this.nextValidCode >= this.maxCode) {
 if (this.bitsPerCode == MAX_BIT_SIZE) {
 // 출력 스트림에 CLEAR 추가_CODE  
 this.writeCode(CLEAR_CODE);
 // 테이블 재설정
 this.resetTable();
 } else {
 this.bitsPerCode += 1;
 this.maxCode = this.maxValueOf(bitsPerCode);
 }
 }
 }
 private void resetTable() {
 Arrays.fill(this.suffixes, (short) 0);
 Arrays.fill(this.children, (short) 0);
 Arrays.fill(this.siblings, (short) 0);
 this.bitsPerCode = MIN_BIT_SIZE;
 this.maxCode = this.maxValueOf(this.bitsPerCode);
 this.nextValidCode = EOI_CODE + 1;
 }
 /**
 * 인코딩 결과를 출력 스트림에 쓰기
 *
 * @param code 인코딩 결과
 * @throws IOException 인코딩 결과를 출력 스트림에 쓸 때 발생하는 예외
 */
 private void writeCode(final int code) throws IOException {
 // 이전 인코딩 결과에서 남은 비트와 현재 값을 결합하는 중간 변수입니다.
 int var1 = (this.bitsFromPreviousCode << this.bitsPerCode) | (code & this.maxCode);
 // 중간 변수, 이전 인코딩 결과에서 남은 비트 길이와 현재 인코딩 결과에 해당하는 비트 길이의 합입니다.
 int var2 = this.bitPosOfNextCode + this.bitsPerCode;
 while (var2 >= 8) {
 // 병합된 비트의 길이가 8보다 큰 경우, 처음 8비트의 데이터는 오른쪽 시프트 연산으로 잘립니다.
 // 0xff에는 두 가지 주요 기능이 있습니다. 하나는 음수를 정수로 변환하는 것이고, 다른 하나는 마지막 악의 사이클에서 var2가 8의 배수일 때 데이터의 마지막 8비트를 가로채는 것입니다.
 int var3 = (var1 >> (var2 - 8)) & 0xff;
 // int에 쓸 때 출력 스트림이 자동으로 바이트 단위로 변환됩니다.
 this.codeStream.write(var3);
 var2 -= 8;
 }
 // 비트포즈오브다음코드 및 비트프롬이전코드 업데이트
 this.bitPosOfNextCode = var2;
 this.bitsFromPreviousCode = var1 & this.bitmaskFor(this.bitPosOfNextCode);
 }
 /**
 * 지정된 값에 해당하는 마스크를 가져옵니다.
 *
 * @param value 특정 값
 * @return  
 */
 private int bitmaskFor(int value) {
 return this.maxValueOf(value);
 }
 /**
 * 지정된 비트 수의 최대값을 반환합니다.
 *
 * @param bitsLength bit 
 * @return 현재 비트 번호로 표현할 수 있는 최대값입니다.
 */
 private int maxValueOf(int bitsLength) {
 return (1 << bitsLength) - 1;
 }
 public static void main(String[] args) throws IOException {
 FastByteArrayOutputStream outputStream = new FastByteArrayOutputStream();
 LZWEncoder lzwEncoder = new LZWEncoder(outputStream);
 // 1. writeCode메서드 테스트
 // 해당 출력은 다음과 같습니다.[-128, 1, -32, 64, -128, 68, 8, 12, 6, -128, -128]
 int[] exampleValues = new int[] {256, 7, 258, 8, 8, 258, 6, 6, 257};
 for (int value : exampleValues) {
 lzwEncoder.writeCode(value);
 }
 if (lzwEncoder.bitPosOfNextCode != 0) {
 lzwEncoder.writeCode(0);
 }
 System.out.println(Arrays.toString(outputStream.toByteArray()));
 // 2. encode메서드 테스트
 // 해당 출력은 다음과 같습니다.[-128, 1, -32, 64, -128, 68, 8, 12, 6, -128, -128]
 byte[] bytes = new byte[] {7, 7, 7, 8, 8, 7, 7, 6, 6};
 ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
 lzwEncoder.encode(byteBuffer);
 System.out.println(Arrays.toString(outputStream.toByteArray()));
 }
}

압축 해제

package.funnymap.compression.lzw;
import org.springframework.util.FastByteArrayOutputStream;
import java.io.*;
import java.util.*;
public class LZWDecoder {
 // 상수 초기화
 private static final int CLEAR_CODE = 256;
 private static final int EOI_CODE = 257;
 private static final int MIN_BIT_SIZE = 9;
 // 초기화 설정
 // 디코딩 프로세스 중에 다음 구성이 사용됩니다.
 private int nextValidCode = EOI_CODE + 1; // 사전에서 새 인코딩 결과의 값입니다.
 private int bitsPerCode = MIN_BIT_SIZE; // 인코딩 결과의 각 비트 길이를 나타내며 최소값은 9, 최대값은 12입니다.
 private int maxCode = this.maxValueOf(bitsPerCode); // 현재 비트 번호로 표현할 수 있는 최대값입니다.
 private int previousCode = 0;
 private Map<Integer, List<Integer>> dictionary; // 디코딩을 위한 사전 테이블
 // 인코딩된 결과를 바이트 단위로 변환할 때 사용되는 구성은 다음과 같습니다.
 private int bitsOfPreviousByte = 0; // 현재 인코딩에 넣어야 하는 이전 바이트의 비트 수입니다.
 private int remainValueOfPreviousByte; // 현재 인코딩된 비트의 값을 이전 바이트에 배치해야 합니다.
 public OutputStream decode(final InputStream inputStream) throws IOException {
 try (FastByteArrayOutputStream byteArrayOutputStream = new FastByteArrayOutputStream()) {
 // 관련 변수 초기화
 this.initialize();
 // 첫 번째 인코딩 결과를 읽고, 끝 마커가 아닌 경우 끝 마커를 찾을 때까지 데이터를 반복합니다.
 int code = this.readLzwCode(inputStream);
 while (code != EOI_CODE) {
 if (code == CLEAR_CODE) {
 // CLEAR_CODE
 // 1. 사전 테이블 및 관련 데이터 재초기화
 this.initialize();
 // 2. CLEAR가 아닌 것을 발견할 때까지 뒷부분을 읽습니다._CODE및 EOI_CODE 
 code = this.readLzwCode(inputStream);
 while (code == CLEAR_CODE) {
 code = this.readLzwCode(inputStream);
 }
 if (code == EOI_CODE) {
 break;
 }
 // 3. CLEAR_CODE그 이후의 첫 번째 데이터는 데이터 테이블에 있어야 하며, 다음과 같이 간단하게
 List<Integer> value = this.dictionary.get(code);
 this.writeToOutput(value, byteArrayOutputStream);
 this.previousCode = code;
 } else {
 // 사전 테이블에 존재하는 경우 해당 코드에 해당하는 값을 작성한 다음 사전 테이블에서 다음과 같이 따라갑니다.<코드 값: Ω+ [0]>같은 방식으로 새 값을 추가합니다.
 List<Integer> value = this.dictionary.get(code);
 if (value != null) {
 this.writeToOutput(value, byteArrayOutputStream);
 List<Integer> newValue = this.concat(this.dictionary.get(this.previousCode), value.get(0));
 this.addToDictionary(newValue);
 this.previousCode = code;
 } else {
 // 사전 표에 없는 경우
 // 1. 새 값 생성 및 쓰기
 List<Integer> previousValue = this.dictionary.get(this.previousCode);
 List<Integer> newValue = this.concat(previousValue, previousValue.get(0));
 this.writeToOutput(newValue, byteArrayOutputStream);
 // 2. 사전 테이블에 새 코드 추가
 this.addToDictionary(newValue);
 this.previousCode = code;
 }
 }
 code = this.readLzwCode(inputStream);
 }
 return byteArrayOutputStream;
 }
 }
 /**
 * 사전 테이블 및 관련 변수 초기화
 */
 private void initialize() {
 this.dictionary = new HashMap<>();
 for (int i = 0; i < 258; i++) {
 this.dictionary.put(i, Collections.singletonList(i));
 }
 this.bitsPerCode = MIN_BIT_SIZE;
 this.maxCode = this.maxValueOf(this.bitsPerCode);
 this.previousCode = 0;
 }
 private void writeToOutput(List<Integer> value, FastByteArrayOutputStream byteArrayOutputStream) throws IOException {
 for (Integer integer : value) {
 byteArrayOutputStream.write(integer);
 }
 }
 private List<Integer> concat(List<Integer> integerList, Integer integer) {
 List<Integer> newValue = new ArrayList<>(integerList);
 newValue.add(integer);
 return newValue;
 }
 private void addToDictionary(List<Integer> integerList) {
 this.dictionary.put(this.nextValidCode, integerList);
 this.nextValidCode += 1;
 this.increaseBitsPerCodeOrRestIfNeeded();
 }
 private void increaseBitsPerCodeOrRestIfNeeded() {
 if (this.nextValidCode >= this.maxCode) {
 this.bitsPerCode += 1;
 this.maxCode = this.maxValueOf(bitsPerCode);
 }
 }
 /**
 * 입력 스트림을 차례로 LZW 인코딩된 결과로 변환합니다.
 *
 * @param inputStream  
 * @return LZW인코딩 결과
 * @throws IOException 입력 스트림에서 데이터를 읽을 때 예외
 */
 private int readLzwCode(final InputStream inputStream) throws IOException {
 int read = inputStream.read();
 // 현재 입력 스트림에 읽을 수 있는 데이터가 없는 경우, EOI를 직접 반환합니다._CODE
 if (read < 0) {
 return EOI_CODE;
 }
 // 마지막 바이트의 결과를 8비트씩 이동하여 현재 값과 병합합니다.
 int var1 = (this.remainValueOfPreviousByte << 8) | read;
 // 이미 읽은 바이트 수: 이전 바이트에서 남은 비트 수입니다.+ 현재 바이트의 비트 8
 int var2 = this.bitsOfPreviousByte + 8;
 // 이미 읽은 바이트 비트 수가 인코딩된 결과에 해당하는 비트 수보다 여전히 적은 경우 다음 값을 읽어야 합니다.
 // bitsPerCode최대값은 12이며, 다음 조건이 두 번 이상 실행되므로 동안 루프가 필요하지 않습니다.
 if (var2 < this.bitsPerCode) {
 read = inputStream.read();
 if (read < 0) {
 return EOI_CODE;
 }
 // 이미 읽은 값은 8비트 왼쪽으로 이동하여 새로 읽은 값과 병합됩니다.
 var1 = (var1 << 8) | read;
 // 이미 읽은 바이트 수+8
 var2 += 8;
 }
 // 읽을 값에서 제외할 비트 수입니다.
 int var3 = var2 - this.bitsPerCode;
 int code = (var1 >> var3) & this.bitmaskFor(this.bitsPerCode);
 // 마스킹하여 다음 인코딩 결과에 넣어야 하는 읽기 바이트의 값을 가져옵니다.
 this.remainValueOfPreviousByte = var1 & this.bitmaskFor(var3);
 this.bitsOfPreviousByte = var3;
 return code;
 }
 /**
 * 지정된 값에 해당하는 마스크를 가져옵니다.
 *
 * @param value 특정 값
 * @return  
 */
 private int bitmaskFor(int value) {
 return this.maxValueOf(value);
 }
 /**
 * 지정된 비트 수의 최대값을 반환합니다.
 *
 * @param bitsLength bit 
 * @return 현재 비트 번호로 표현할 수 있는 최대값입니다.
 */
 private int maxValueOf(int bitsLength) {
 return (1 << bitsLength) - 1;
 }
 public static void main(String[] args) throws IOException {
 LZWDecoder decoder = new LZWDecoder();
 byte[] bytes = new byte[]{-128, 1, -32, 64, -128, 68, 8, 12, 6, -128, -128};
 InputStream inputStream = new ByteArrayInputStream(bytes);
 OutputStream outputStream = decoder.decode(inputStream);
 System.out.println(Arrays.toString(((FastByteArrayOutputStream) outputStream).toByteArray()));
 }
}

원격 감지 이미지 슬라이싱 무료 5: TIFF 파일에 LZW 알고리즘 적용하기

TIFF 파일용 LZW 사용 요구 사항

미리 알아야 할 사항

TIFF 파일용 LZW 구현 3.

난이도 분석

코딩을 효과적으로 관리하는 방법

무기한 길이 인코딩 결과를 처리하는 방법

압축 프로세스

데이터 인코딩

무기한 길이 인코딩 결과를 바이트로 변환하기

압축 해제 프로세스

데이터 디코딩

바이트에서 LZW로 인코딩 결과

전체 코드

압축

압축 해제

Read next

자바 개요 및 환경 3

VisionPro 개발 - 객체가 항상 카메라를 향하도록 만드는 방법은 무엇인가요?

답변:chatGPT 무료 버전_chatGPT 무료 버전_chatGPT 무료 사용 방법_chatGPT 무료 웹 버전

엑스트라넷 프로그램에 대한 Linux 인트라넷 액세스

유형 "스크립트" 튜토리얼 유형 "스크립트" 심볼

Linux에서 HTTP 프록시 서버 설정하기