feat: add anonymous feature-usage telemetry #1928
`@@ -0,0 +1,231 @@` (new file: `bitsandbytes/_telemetry.py`)

```python
# Copyright (c) Facebook, Inc. and its affiliates.
```
**Member:** minor nit: copyright here is wrong, let's take this out or replace with something more appropriate
```python
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
"""Anonymous feature-usage telemetry for bitsandbytes.

Sends one HEAD request per distinct feature per process via
`huggingface_hub.utils.send_telemetry()`. Data lands in the Hugging Face
Hub telemetry index under `path_prefix == "/api/telemetry/bitsandbytes/"`
and informs maintenance and deprecation decisions.

What is collected
- Session fingerprint (once per process, first feature use):
  bnb version, OS name/version, CPU arch, glibc version, Python/torch
  versions, accelerator vendor/name/arch/count.
- Per-feature events: feature name plus feature-specific metadata
  (e.g. `quant_type="nf4"`, `bits="8"`, `paged="true"`).

What is NOT collected
Model names, file paths, parameter shapes, user identifiers, training
data, gradient values, or any value derived from user input.

Automatically disabled when running under pytest (detected via
`pytest` in `sys.modules` or `PYTEST_CURRENT_TEST` env var) so that test
runs in CI and locally do not pollute the real-usage stream.

Opt-out (any of the following env vars disables all telemetry):
- BNB_DISABLE_TELEMETRY=1 (bitsandbytes only)
```
**Member:** I am not sure we need to roll our own; it's cleaner to just reuse the existing? I don't see a use case for e.g. opting out of HF Hub telemetry but still opting in for BNB.
```python
- HF_HUB_DISABLE_TELEMETRY=1 (all HF libraries)
- HF_HUB_OFFLINE=1 (all HF libraries)

End-to-end verification:
Set `BNB_TELEMETRY_TAG=<some-id>` before importing bitsandbytes and the
value is attached as `bitsandbytes.tag` on every event. Use this to
correlate a single run's events in ES.
```
**Member** (on lines +32 to +35): Seems unnecessary/overkill?
```python
No-ops silently if `huggingface_hub` is not installed, and never raises.

Keys are namespaced under `bitsandbytes.*` in the resulting
`metadata.bitsandbytes.*` fields so they do not collide with fields logged
by other libraries in the shared telemetry index.
"""

from __future__ import annotations

import logging
import os
import platform
import sys
from typing import Optional

logger = logging.getLogger(__name__)

_REPORTED: set[str] = set()
_FINGERPRINT: Optional[dict[str, str]] = None

_TRUTHY = frozenset({"1", "true", "yes", "on"})


def _is_pytest() -> bool:
    """Detect whether we are running inside a pytest process.

    Telemetry is suppressed during test runs so that CI and local test
    invocations don't pollute the real-usage stream. Tests that want to
    assert on telemetry behavior monkey-patch this function to return False.
    """
    return "pytest" in sys.modules or "PYTEST_CURRENT_TEST" in os.environ
```
**Member** (on lines +60 to +67): I would consider looking at other env variables and not bother with the pytest check. Most CI platforms will have an env var like...
```python
def _is_disabled() -> bool:
    for var in ("BNB_DISABLE_TELEMETRY", "HF_HUB_DISABLE_TELEMETRY", "HF_HUB_OFFLINE"):
        if os.environ.get(var, "").strip().lower() in _TRUTHY:
            return True
    if _is_pytest():
        return True
    return False


def _os_info() -> tuple[str, str]:
    os_name = platform.system()
    os_name = {"Darwin": "macOS"}.get(os_name, os_name)
    if os_name == "Windows":
        try:
            build = sys.getwindowsversion().build
            os_version = f"11 (build {build})" if build >= 22000 else f"10 (build {build})"
```
**Member:** This seems fragile and also ignores Windows Server etc.
```python
        except Exception:
            os_version = platform.release()
    elif os_name == "macOS":
        os_version = platform.mac_ver()[0] or platform.release()
    else:
        os_version = platform.release()
    return os_name, os_version


def _accel_info() -> dict[str, str]:
    info: dict[str, str] = {}
    try:
        import torch
    except ImportError:
```
**Member** (on lines +97 to +99): torch is already a pretty hard dependency; this shouldn't need to be caught.
```python
        info["bitsandbytes.accel"] = "unknown"
        return info

    try:
        if torch.cuda.is_available():
            vendor = "amd" if getattr(torch.version, "hip", None) else "nvidia"
            info["bitsandbytes.accel"] = vendor
            info["bitsandbytes.accel_count"] = str(torch.cuda.device_count())
            props = torch.cuda.get_device_properties(0)
            info["bitsandbytes.accel_name"] = props.name
            if vendor == "nvidia":
                info["bitsandbytes.accel_arch"] = f"sm_{props.major}{props.minor}"
            else:
                info["bitsandbytes.accel_arch"] = getattr(props, "gcnArchName", "unknown")
            return info
```
**Member** (on lines +104 to +115): This only looks at the first device; I'm not sure, but we may be interested when there are multiple devices and they're different. I'm wondering if for that we just add some sort of flag telling us whether it is a heterogeneous system or not. Likely it is, but it may be valuable to find out otherwise. Let's grab device 0's SM count and memory. We don't really need the name. So this should be for both AMD and NVIDIA the...
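The reviewer's suggestion (device 0's SM count and memory plus a heterogeneity flag) could be sketched as below. In real code the per-device tuples would come from `torch.cuda.get_device_properties(i)` (`multi_processor_count`, `total_memory`); plain data stands in here so the logic is self-contained:

```python
from typing import NamedTuple


class DeviceProps(NamedTuple):
    # Stand-ins for torch.cuda.get_device_properties(i) fields.
    multi_processor_count: int
    total_memory: int  # bytes


def summarize_devices(devices: list[DeviceProps]) -> dict[str, str]:
    """Record device 0's SM count and memory, plus a heterogeneity flag."""
    return {
        "accel_sm_count": str(devices[0].multi_processor_count),
        "accel_memory_mb": str(devices[0].total_memory // (1024 * 1024)),
        "accel_count": str(len(devices)),
        # "true" when the devices do not all share the same SM count/memory.
        "accel_heterogeneous": str(len(set(devices)) > 1).lower(),
    }


# Example: two identical parts vs. a mixed pair.
same = [DeviceProps(108, 80 * 1024**3)] * 2
mixed = [DeviceProps(108, 80 * 1024**3), DeviceProps(84, 48 * 1024**3)]
```

Keying the flag on the property tuple rather than the device name keeps it vendor-neutral, which matches the "both AMD and NVIDIA" intent of the comment.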
```python
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            info["bitsandbytes.accel"] = "xpu"
            info["bitsandbytes.accel_count"] = str(torch.xpu.device_count())
            try:
                info["bitsandbytes.accel_name"] = torch.xpu.get_device_properties(0).name
            except Exception:
                pass
            return info

        if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
            info["bitsandbytes.accel"] = "mps"
            return info

        if hasattr(torch, "hpu") and torch.hpu.is_available():
            info["bitsandbytes.accel"] = "hpu"
            return info
    except Exception:
        pass

    info["bitsandbytes.accel"] = "cpu"
    return info


def _fingerprint() -> dict[str, str]:
    global _FINGERPRINT
    if _FINGERPRINT is not None:
        return _FINGERPRINT

    try:
        import bitsandbytes

        version = bitsandbytes.__version__
    except Exception:
        version = "unknown"

    os_name, os_version = _os_info()
    info = {
        "bitsandbytes.version": version,
        "bitsandbytes.os": os_name,
        "bitsandbytes.os_version": os_version,
        "bitsandbytes.arch": platform.machine(),
        "bitsandbytes.python": platform.python_version(),
```
**Member:** I think this is redundant too; huggingface_hub likely includes the Python version already.
```python
    }
    if os_name == "Linux":
        try:
            libc_name, libc_ver = platform.libc_ver()
            if libc_name:
                info["bitsandbytes.libc"] = f"{libc_name}-{libc_ver}"
        except Exception:
            pass
    try:
        import torch

        info["bitsandbytes.torch"] = torch.__version__
    except ImportError:
        pass
```
**Member** (on lines +166 to +171): I think this is redundant; does hf hub automatically collect this?
```python
    info.update(_accel_info())

    _FINGERPRINT = info
    return info


def report_feature(feature: str, details: Optional[dict[str, object]] = None) -> None:
```
**Member:** I think for more clarity we should just name this...
```python
    """Report that a bitsandbytes feature was used.

    Fires at most once per `feature` per process. Subsequent calls with the
    same `feature` are O(1) no-ops.

    Args:
        feature: Short feature name. Becomes the final URL path segment:
            `/api/telemetry/bitsandbytes/{feature}` (so it appears as
            `path_filename` in ES queries).
        details: Optional feature-specific key/value metadata. Keys without a
            `bitsandbytes.` prefix are prefixed automatically.
    """
    if feature in _REPORTED:
        return
    _REPORTED.add(feature)

    if _is_disabled():
        return
```
**Member** (on lines +192 to +197): We may want to do de-duping more granular than just the "feature" name as it is. But maybe we just name the features differently in that case, so that's more of a minor nit. Should we add to `_REPORTED` even when disabled? Seems to me we should just exit right away.
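Granular de-duping as suggested could key on the feature name plus its metadata rather than the name alone. A minimal sketch (the helper names here are hypothetical, not from the PR):

```python
from typing import Optional

_REPORTED_KEYS: set[tuple] = set()


def _dedup_key(feature: str, details: Optional[dict] = None) -> tuple:
    """Key events by feature name plus sorted metadata, not feature alone."""
    items = tuple(sorted((k, str(v)) for k, v in (details or {}).items()))
    return (feature, items)


def should_send(feature: str, details: Optional[dict] = None) -> bool:
    """Return True only the first time a (feature, details) pair is seen."""
    key = _dedup_key(feature, details)
    if key in _REPORTED_KEYS:
        return False
    _REPORTED_KEYS.add(key)
    return True
```

With this scheme, `linear_4bit` with `quant_type="nf4"` and with `quant_type="fp4"` would each fire once per process, which is the granularity the comment asks about.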
```python
    try:
        from huggingface_hub.utils import send_telemetry
    except ImportError:
        return

    fingerprint = _fingerprint()
    user_agent = dict(fingerprint)
    user_agent["bitsandbytes.feature"] = feature
    if details:
        for k, v in details.items():
            key = k if k.startswith("bitsandbytes.") else f"bitsandbytes.{k}"
            user_agent[key] = str(v)

    tag = os.environ.get("BNB_TELEMETRY_TAG", "").strip()
    if tag:
        user_agent["bitsandbytes.tag"] = tag
```
**Member** (on lines +212 to +214): Same as the comment earlier, seems unnecessary.
```python
    try:
        send_telemetry(
            topic=f"bitsandbytes/{feature}",
            library_name="bitsandbytes",
            library_version=fingerprint.get("bitsandbytes.version", "unknown"),
            user_agent=user_agent,
        )
    except Exception as e:
        logger.debug("bitsandbytes telemetry send failed: %s", e)


def _reset_for_testing() -> None:
    """Clear module state. Intended for use in test fixtures only."""
    global _FINGERPRINT
    _REPORTED.clear()
    _FINGERPRINT = None
```
@@ -11,6 +11,7 @@ | |
| import torch.nn.functional as F | ||
|
|
||
| import bitsandbytes as bnb | ||
| from bitsandbytes._telemetry import report_feature | ||
| from bitsandbytes.functional import ( | ||
| QuantState, | ||
| _convert_weight_packed_for_cpu, | ||
|
|
@@ -97,6 +98,7 @@ def __init__( | |
| ) | ||
| self.norm = torch.nn.LayerNorm(embedding_dim, device=device) | ||
| GlobalOptimManager.get_instance().register_module_override(self, "weight", {"optim_bits": 32}) | ||
| report_feature("embedding", {"variant": "stable"}) | ||
|
|
||
| def reset_parameters(self) -> None: | ||
| torch.nn.init.xavier_uniform_(self.weight) | ||
|
|
@@ -179,6 +181,7 @@ def __init__( | |
| device=device, | ||
| ) | ||
| GlobalOptimManager.get_instance().register_module_override(self, "weight", {"optim_bits": 32}) | ||
| report_feature("embedding", {"variant": "standard"}) | ||
|
|
||
| def reset_parameters(self) -> None: | ||
| torch.nn.init.xavier_uniform_(self.weight) | ||
|
|
@@ -239,6 +242,15 @@ def __new__( | |
| self.bnb_quantized = bnb_quantized | ||
| self.data = data | ||
| self.module = module | ||
| report_feature( | ||
| "params_4bit", | ||
| { | ||
| "quant_type": quant_type, | ||
| "blocksize": blocksize, | ||
| "compress_statistics": compress_statistics, | ||
| "quant_storage": str(quant_storage).replace("torch.", ""), | ||
| }, | ||
|
**Member** (on lines +245 to +252): Starting to think we don't need this here; would prefer we just keep Linear4bit and Linear8bitLt but remove this on Params4bit/Int8Params.
```python
        )
        return self

    def __getstate__(self):
```

`@@ -607,6 +619,16 @@ def _save_to_state_dict(self, destination, prefix, keep_vars):`

```python
            destination[prefix + "weight." + k] = v if keep_vars else v.detach()

    def forward(self, x: torch.Tensor):
        report_feature(
            "linear_4bit",
            {
                "quant_type": getattr(self.weight, "quant_type", "unknown"),
                "blocksize": getattr(self.weight, "blocksize", 0),
                "compress_statistics": getattr(self.weight, "compress_statistics", False),
                "input_dtype": str(x.dtype).replace("torch.", ""),
                "compute_dtype": (str(self.compute_dtype).replace("torch.", "") if self.compute_dtype else "auto"),
            },
        )
```
**Member** (on lines 621 to +631): I would prefer we do this in...
```python
        fix_4bit_weight_quant_state_from_module(self)
        quant_state = self.weight.quant_state
```

`@@ -732,6 +754,7 @@ def __new__(`

```python
        obj.CB = CB
        obj.SCB = SCB
        obj.has_fp16_weights = has_fp16_weights
        report_feature("int8_params", {"has_fp16_weights": has_fp16_weights})
        return obj

    def _quantize(self, device):
```

`@@ -855,6 +878,7 @@ def __init__(self, num_embeddings, embedding_dim, device=None, dtype=None):`

```python
        self.dtype = self.weight.data.dtype

        self.weight = Int8Params(self.weight.data, has_fp16_weights=False, requires_grad=False)
        report_feature("embedding", {"variant": "8bit"})

    def _save_to_state_dict(self, destination, prefix, keep_vars):
        raise NotImplementedError("Saving Embedding8bit module is not implemented")
```

`@@ -926,6 +950,7 @@ def __init__(`

```python
                f"Embedding size {embedding_dim} is not divisible by block size {blocksize}. "
                "This will lead to slow inference.",
            )
        report_feature("embedding", {"variant": "4bit", "quant_type": quant_type})

    def _forward_with_partial_dequantize(self, input: Tensor):
        assert self.embedding_dim % self.weight.quant_state.blocksize == 0
```

`@@ -1178,6 +1203,14 @@ def to(self, *args, **kwargs):`

```python
        return result

    def forward(self, x: torch.Tensor):
        report_feature(
            "linear_8bit",
            {
                "has_fp16_weights": self.state.has_fp16_weights,
                "threshold": self.state.threshold,
                "input_dtype": str(x.dtype).replace("torch.", ""),
            },
        )
```
**Member** (on lines +1206 to +1213): Same comment as with...
```python
        self.state.is_training = self.training
        if self.weight.CB is not None:
            self.init_8bit_state()
```

`@@ -1199,6 +1232,7 @@ def __init__(self, input_features, output_features, bias=True, device=None):`

```python
        super().__init__(input_features, output_features, bias, device)
        self.outlier_dim = None
        self.is_quantized = False
        report_feature("outlier_aware_linear")

    def forward_with_outliers(self, x, outlier_idx):
        raise NotImplementedError("Please override the `forward_with_outliers(self, x, outlier_idx)` function")
```
**Member:** Let's simplify this: